Executive Summary

Purpose 

Multibillion-dollar acquisition decisions for major weapon systems
should in principle be based on the results of testing weapons under
conditions that replicate actual combat. However, subjecting complex and
expensive weapon systems to the necessary number of such tests is
sometimes impractical or impossible. One alternative is to use computer
models to simulate performance, but simulation results must be as
representative of real-world outcomes as possible. The need for
representativeness generates the major objective GAO addressed in this
report: to determine, using three case studies, that it is possible to
assess the credibility of simulation-generated data. A second objective
was to identify the steps the Department of Defense (DOD) has taken to
foster the credibility of its simulations.

GAO posed three major questions: (1) What factors should be considered
in a systematic attempt to assess the credibility of a simulation? (2)
What are the results of assessing specific operational-effectiveness
simulations of weapon systems with respect to these factors? (3) What
efforts has the Department of Defense made to foster and reinforce
simulation credibility?

Background 

DOD uses developmental and operational tests and evaluations as part of
a weapon system’s acquisition program to provide evidence that the
weapon system performs as expected before proceeding through development
phases to full-scale production. Field tests are important in
determining the extent to which a weapon system satisfies operational
requirements, but when such tests do not provide sufficient information,
DOD often uses simulation models to generate supplemental data about a
weapon’s effectiveness. Although simulations are useful tools, they are
always approximations to reality and, therefore, their credibility (the
level of confidence that a decisionmaker should have in their results)
is open to question.

GAO developed its own assessment framework and applied it to three
operational effectiveness simulations developed for Army air defense
system programs: the Carmonette and ADAGE computer simulations used
in the division air defense gun (DIVAD) acquisition program and the COMO
III computer simulation applied in the Stinger missile program.

Results  in  Brief 

Using the framework in the accompanying table, GAO found that each
simulation had strong points but also had weaknesses and limitations
that degraded its credibility severely enough to question its usefulness.
Area of concern and factors

Theory, model design, and input data
 1. Match between theoretical approach and real events being simulated
 2. Choice of measures of effectiveness
 3. Portrayal of weapon’s immediate combat environment
 4. Representation of operational performance
 5. Depiction of critical aspects of broad-scale battle environment
 6. Appropriateness of mathematical and logical representation
 7. Selection of input data

The correspondence between the model and the real world
 8. Verification effort
 9. Attention to statistical quality of results
10. Sensitivity testing effort
11. Validation effort

Management issues
12. Organizational support
13. Documentation
14. Full disclosure of results


One  consistent  weakness  in  all  three  simulations  that  potentially  poses  a 
major  threat  to  credibility  is  the  limited  evidence  of  efforts  to  validate 
simulation  results  by  comparing  them  with  operational  tests,  historical 
data,  or  other  models. 

Guidance from the Office of the Secretary of Defense in the form of
procedures would provide a structured way of assessing the simulations’
credibility.


Principal  Findings 


GAO’s Assessment Framework


GAO’s assessment framework of 14 factors should be considered in
attempts to evaluate a simulation’s credibility. The number of factors
could vary (other frameworks may contain fewer or more), but it is
important that they cover the three major areas of concern: theory,
model design, and input data; the correspondence between the model
and the real world; and management, documentation, and reporting
issues. Collecting and analyzing information about each factor should
help identify a simulation’s strengths and weaknesses and, therefore, its
credibility. GAO’s framework proved useful for the three case study
simulations in this respect. (See pages 17-22.)


Assessment of Selected Simulations


GAO found that for all three simulations (the Carmonette, ADAGE, and
COMO III) evidence of credibility was provided on only a few factors:
measures of effectiveness, the representation of a weapon’s engaging
targets, sensitivity testing, and the disclosure of strengths and
weaknesses of results. Even so, the simulations were still limited on
these factors. (See pages 30, 34, 42, and 51.)

Generally, the principal weakness centered on the lack of validation of
simulation results. Validation can be difficult, but it must be dealt with
if simulation results are to be credible. (See pages 44-46.)

For most factors, the three simulations varied considerably. For
example, the Carmonette simulation of the DIVAD was severely limited in
its ability to portray a battle of area and duration appropriate for a
division-oriented weapon. With regard to mathematical and logical
representation, the simulations using the Carmonette and COMO III
treated attrition continuously throughout a battle, whereas the ADAGE
approach calculated attrition only at the end of a battle period, a
procedure that can introduce bias. The effort required to remove these
limitations and some of those found in other areas might be
considerable, but others could be corrected with relatively minor
effort. (See pages 33, 36-37, and 39.)


DOD Guidance


The Department of the Army has been relatively active in fostering the
development of organizations that can directly influence the credibility
of simulation results. While DOD officials agree that credibility is
important, and while there is some consensus about what should be done
to achieve such credibility, DOD generally has not in fact established
the credibility of its simulations systematically and uniformly. No
guidance exists at the level of the Office of the Secretary of Defense
that can be routinely used throughout DOD to review the credibility of
military models. (See pages 54-56.)
Recommendations


GAO recommends that the Secretary of the Department of Defense adopt
or develop and implement guidance on producing, validating,
documenting, managing, maintaining, using, and reporting simulations of
weapon-system effectiveness. This guidance should include a way of
routinely providing reviews of a simulation’s credibility and, in this
way, identifying problems that should be resolved. The Secretary should
explore requiring that a statement regarding validation efforts
accompany simulation results.

GAO also recommends that the Secretary of the Department of Defense
direct the agencies responsible for managing the ADAGE, Carmonette, and
COMO III models to explore the feasibility of correcting the limitations
GAO has identified, especially the limitations in validation.


Agency Comments


In commenting on a draft of this report, DOD generally found the report
to be technically correct and concurred with GAO’s two
recommendations. It has sent GAO’s factors for assessing simulations to
the services for review and evaluation.

DOD raised some concerns about the scope and focus of the report. One
was about generalizing from three case studies; DOD asserted that GAO
did, indeed, do this but did not cite specific examples to support the
assertion. GAO’s purpose was to demonstrate from case studies that one
can systematically collect and analyze information about a simulation
that would permit one to assess its credibility. GAO did not intend to
infer from these case studies anything with regard to the credibility of
other simulations.

DOD also contends that applying GAO’s framework gives only part of a
simulation’s picture and that people, input data, and a model’s
application are also important. GAO certainly agrees but points out that
factors 1, 7, and 12, whose importance was defined in the draft report,
do consider these. (See pages 62, 63, and 242.)

Other technical comments are found in DOD’s letter and comments
reprinted in appendix V.
Chapter 1

Introduction 


Simulation is a two-phased process of constructing a model of an
existing or a proposed system and conducting experiments with the
model so as to understand the behavior of the system or evaluate
strategies for its operation. A simulation is more than a static picture
of the system; a simulation imitates the system’s human and machine
operation or behavior over time. In a military context, simulation can
be a tool for analyzing the performance and operation of a weapon-system
component (for example, the radar of a surface-to-air missile system),
the total weapon system (for example, the complete surface-to-air
missile system), or the total panoply of weapon and communication
systems (for example, an air defense system).

The Department of Defense (DOD) uses developmental and operational
testing and evaluation in weapon-system acquisition programs to provide
the evidence that, among other things, a weapon system meets
performance specifications and can perform as expected in realistic
operating conditions. In principle, this evidence should be obtained
empirically from developmental and operational tests for acquisition
decisions. However, as weapon systems have become ever more complex
and expensive and as attempts to expedite the acquisition process have
increased, the willingness and sometimes the ability to subject them to
extensive field testing to determine their effectiveness and suitability
have diminished. Acquiring the needed information efficiently during
the acquisition process requires an appropriate use of available
methods. Simulation can be used in conjunction with field
experimentation and other analytical methods, with the likely result
that the benefit of the combination will exceed the benefits of the
individual methods.

Evidence suggests that DOD uses simulation substantially in the
developmental and operational test phases of the acquisition of weapons.
However, questions arise about the credibility of simulation-generated
data and DOD’s practices for ensuring that simulations produce sound
results. When simulations contribute information for multibillion-dollar
weapon-system development and procurement decisions, it is important
that they provide usable, high-quality information.

In this report, we describe our development of a method for reviewing
simulations of the operational effectiveness of weapon systems. Drawing
on assessment frameworks developed by other researchers, we developed a
conceptual framework for systematically reviewing simulations and
applied it to selected Army simulations used in the acquisition of air
defense systems. We viewed our task as developing and testing a review
framework to illustrate how it can provide insights into a simulation’s
strengths and weaknesses, especially in terms of identifying areas for
improvements.


System Programs


Simulations can be and often are used throughout the life cycle of a
weapon system. Simulations are used frequently in conjunction with
other analytical methods and field experimentation, each approach
contributing to the understanding of a weapon system’s functioning.
During the concept exploration and early development phases of research,
development, testing, and evaluation, contractors and the developing
agencies use simulations for such purposes as

studying alternatives to a weapon system by conducting trade-off and
parametric studies,

defining a system’s and subsystem’s requirements, and

determining a system’s design.

During later stages of development, the test and evaluation agencies, the
operational (or user) groups, the development agency, and the
contractors use simulations for

investigating a system’s or subsystem’s performance,

identifying its problems and limitations,

estimating operational effectiveness,

determining logistic and support requirements, and

determining tactics.

DOD has developed and uses a number of computer models that simulate
weapon systems in combat. Models are complex computer programs for
mimicking what happens in the real world when a weapon is used. Models
used for operational effectiveness studies are ordinarily designed to
simulate more than one type of weapon system. When simulations are
needed for studies and analyses, DOD may choose existing models or
develop new ones. The development and maintenance of major simulation
models are usually the responsibility of specific organizational units
within DOD.
The overriding advantage of simulation is perhaps the opportunity to
investigate questions and problems that could otherwise not be
addressed and to investigate them systematically with numerous
replications under controlled conditions. In a simulation, both the model
of a system and the model of its environment can be altered in an
organized manner. A model provides information on performance under
assumed external conditions and permits the investigation of the system’s
response to changes in these conditions and to changes in the original
characteristics of the system itself.

In addition, experiments can be performed on the model of a system that
may not exist or that exists only in limited numbers or that operates in a
physical environment that is not accessible. Simulations can provide
information about a system’s probable performance under conditions
that cannot be tested because of costs, the lack of adequate equipment
and realistic test environments, or safety and security restrictions.
Simulation allows the exploration of more aspects of a system’s
performance more easily than is possible with field experimentation on
an actual system. Moreover, the development of a model and the
simulation process do not consume or destroy a weapon system. After the
possible consequences of using a weapon have been modeled, the results
of simulations can be validated by field testing.

Simulation also has disadvantages. A model is an approximation, not the
equivalent, of a real system. Inaccurate assumptions about a weapon or
its environment may cause the results of a simulation to diverge from
reality. Important variables or relationships may be omitted, and
appropriate values for those that are included may be difficult to
obtain. Data and resources for validating simulations may not be
available. Statistical complexities may obscure the results. Simulations
cannot be better than the analysts’ understanding of the concepts, the
hardware, and the relationships involved; unasked questions do not get
answered in a weapon-system simulation. Conducting simulation
experiments has its own set of problems. For example, a simulation
generally requires different people and equipment from those required in
field-testing the actual system. And the simulation of a total system
has its costs in terms of development time, staffing, and computer
resources.


The Credibility of Simulation Results


Simulations can be valuable aids for decisionmaking, but there will
always be some concern about drawing the wrong conclusions from
them. Since simulations are abstractions or approximations of the real
world, questions arise about their credibility. We define a simulation’s
“credibility” as the level of confidence in its results. To say that
simulation results are credible implies evidence that the correspondence
between the real world and the simulation is reasonably satisfactory for
the intended use. Credibility is not an absolute condition but is
measured on a continuum.

While it is true that assessing credibility will always require some level
of subjective judgment, it is also true that many parts of a simulation
lend themselves to scientific and empirical tests and checks. Any
framework for assessing simulations, including the one we developed,
must therefore address the things that can be tested as well as those
that must ultimately rely on informed but judgmental conclusions.
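
One kind of empirical check concerns the statistical quality of results
from a stochastic simulation. The sketch below is illustrative only and
is not part of the original report: the toy engagement model
(`simulate_engagement`), its parameters, and all numbers are hypothetical
and are not drawn from Carmonette, ADAGE, or COMO III. It runs the toy
model many times with a fixed random seed, reports the mean outcome with
an approximate 95-percent confidence interval, and repeats the run at a
second input value as a crude sensitivity test.

```python
import math
import random
import statistics


def simulate_engagement(rng, targets=10, shots_per_target=3, p_kill=0.3):
    """Toy stochastic model (hypothetical): targets killed in one battle.

    Each target is engaged with up to `shots_per_target` independent shots,
    each with single-shot kill probability `p_kill`.
    """
    kills = 0
    for _ in range(targets):
        # any() stops at the first hit, so later shots are not fired.
        if any(rng.random() < p_kill for _ in range(shots_per_target)):
            kills += 1
    return kills


def replicate(p_kill, replications=2000, seed=12345):
    """Run independent replications and summarize their statistical quality."""
    rng = random.Random(seed)  # fixed seed makes the experiment repeatable
    outcomes = [simulate_engagement(rng, p_kill=p_kill)
                for _ in range(replications)]
    mean = statistics.mean(outcomes)
    std_error = statistics.stdev(outcomes) / math.sqrt(replications)
    half_width = 1.96 * std_error  # normal approximation to a 95% interval
    return mean, (mean - half_width, mean + half_width)


if __name__ == "__main__":
    for p_kill in (0.2, 0.6):  # crude sensitivity test on a single input
        mean, (low, high) = replicate(p_kill)
        print(f"p_kill={p_kill}: mean kills {mean:.2f} "
              f"(95% CI {low:.2f} to {high:.2f})")
```

A narrow interval says only that the replications agree with one another;
it says nothing about whether the model agrees with the real world, which
remains a matter for validation against field tests or other evidence.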


Objectives,  Scope,  and 
Methodology 


In previous reports, we have addressed issues regarding simulation
evaluation methodology and, more specifically, the modeling of weapon
systems. A major objective of this report was to demonstrate, using
three case studies, that it is possible to systematically collect and
analyze information about a simulation that would permit an assessment
of the credibility of that simulation. A second objective was to identify
the steps DOD has taken to ensure the credibility of its simulations. To
meet these objectives, we sought the answers to three evaluation
questions:


1.  What  factors  should  be  considered  in  a  systematic  attempt  to  assess 
the  credibility  of  a  simulation? 

2.  What  are  the  results  of  an  assessment  of  selected  weapon-system 
operational-effectiveness  simulations  with  respect  to  these  factors? 

3. What efforts has DOD made to foster and reinforce the credibility of its
simulations?


The  factors  we  identified  in  the  first  question  provide  a  framework  for 
collecting  information  about  specific  simulations.  This  framework 
allows  for  the  identification  of  a  simulation's  strengths  and  weaknesses 
with  respect  to  each  factor.  The  strengths  enhance  the  confidence  a  user 
might  have  in  the  simulation,  and  the  weaknesses  translate  into  threats 
to  that  confidence.  Further,  the  weaknesses  point  to  remedial  efforts 
that  could  increase  credibility. 
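
A framework of this kind can be pictured as a simple record structure.
The sketch below is illustrative only and is not part of the original
assessment: the area and factor names are those listed in the framework
table, while the `Finding` record, the `area_of` and `summarize`
functions, and the sample entries are hypothetical.

```python
from dataclasses import dataclass

# Areas of concern and their factors, as listed in the framework table.
FRAMEWORK = {
    "Theory, model design, and input data": {
        1: "Match between theoretical approach and real events being simulated",
        2: "Choice of measures of effectiveness",
        3: "Portrayal of weapon's immediate combat environment",
        4: "Representation of operational performance",
        5: "Depiction of critical aspects of broad-scale battle environment",
        6: "Appropriateness of mathematical and logical representation",
        7: "Selection of input data",
    },
    "The correspondence between the model and the real world": {
        8: "Verification effort",
        9: "Attention to statistical quality of results",
        10: "Sensitivity testing effort",
        11: "Validation effort",
    },
    "Management issues": {
        12: "Organizational support",
        13: "Documentation",
        14: "Full disclosure of results",
    },
}


@dataclass(frozen=True)
class Finding:
    """One observation about a simulation (hypothetical record layout)."""
    factor: int  # factor number, 1 through 14
    kind: str    # "strength" or "weakness"
    note: str


def area_of(factor):
    """Return the area of concern that contains the given factor number."""
    for area, factors in FRAMEWORK.items():
        if factor in factors:
            return area
    raise ValueError(f"unknown factor: {factor}")


def summarize(findings):
    """Count strengths and weaknesses within each area of concern."""
    summary = {area: {"strength": 0, "weakness": 0} for area in FRAMEWORK}
    for finding in findings:
        summary[area_of(finding.factor)][finding.kind] += 1
    return summary


if __name__ == "__main__":
    # Hypothetical sample entries, not actual assessment records.
    sample = [
        Finding(2, "strength", "measures of effectiveness are documented"),
        Finding(11, "weakness", "limited evidence of validation"),
    ]
    for area, counts in summarize(sample).items():
        print(f"{area}: {counts}")
```

Counting findings is only a bookkeeping aid; judging how much a given
weakness threatens credibility still requires the informed judgment
discussed above.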

The answer to the second question involved demonstrating that the
framework can be applied as a guide for assessing three simulations of
operational effectiveness and identifying areas where improvements
would reduce threats to credibility. To answer the third question, we
used information we collected while performing these case studies and
additional data we collected during our review.


What Factors Should Be Considered?

To identify the factors that should be considered in a systematic
attempt to assess the credibility of a simulation, we interviewed DOD
officials, operations research analysts, other analysts, and test
engineers, and we reviewed literature on the development and use of
simulations. From this, we developed a framework of three major areas of
concern and 14 factors, which we describe in chapter 2.


What Are the Results of Assessing Simulations With These Factors?

To answer the question on the results of assessing selected
weapon-system operational-effectiveness simulations with respect to
these factors, we applied our framework to three case studies. To select
cases, we identified weapon-system programs that had used major
simulations of operational effectiveness in support of acquisition
decisions. We did this because we believe that the most useful process
is to assess the credibility of a simulation in the context of its
application in the study of particular issues. We also wanted, however,
to examine general purpose models that had the ability to simulate
several types of weapon systems.

We judgmentally selected two Army antiaircraft defense systems: the
portable, shoulder-fired, infrared, surface-to-air Stinger missile and the
division air defense gun (DIVAD, known also as the “Sgt. York”), a
surface-to-air, radar-guided gun on a tracked vehicle. For these two
weapon systems, we chose three simulations: for the Stinger missile, we
chose the COMO III model, and for the DIVAD, we chose the Carmonette and
air defense air-to-ground engagement (ADAGE) models. We describe these
weapon systems and simulation models in chapter 3. (In appendix I, we
also briefly describe how simulations were used in studies for the two
weapon-system programs.)

We obtained general descriptions of the simulations and the use of their
results in the acquisition process. We also reviewed documentation
explaining how these simulations were developed and validated. We
interviewed the analysts and test engineers who were involved in
developing and using the simulations, asking for their perceptions as
well as documentation pertinent to factors in our framework. We also
interviewed several persons responsible for the maintenance of the
simulations and for using the simulation results. We interviewed others
who dealt with other aspects of the simulation development and experts
in related subjects, such as operations research, combat environments,
threat assessment, and field tests.

This provided us with information about the alternative theories,
assumptions, data, and procedures that were used in developing,
running, and reporting the simulations we reviewed. Using our framework
to guide our analysis of these data, we identified strengths and
weaknesses that could enhance or threaten the credibility of the
simulations. Our summary findings for the three case studies are in
chapters 4, 5, and 6, and additional detail on them is in appendixes II,
III, and IV.


What Effort Has DOD Made Toward Credibility?

To address our third question (What effort has DOD made to foster and
reinforce the credibility of its simulations?), we collected and reviewed
information about DOD and Army regulations and policies relevant to
simulation development, management, and assessment generally and to
the simulations we reviewed specifically. We also interviewed DOD
officials responsible for managing and performing simulations. Our
findings, presented in chapter 7, provide information on DOD’s
mechanisms and procedures for gaining and maintaining the credibility of
its simulations.


Our Study’s Strengths and Limitations

We examined other assessment procedures and structures and based our
framework on this body of work, but we found few examples of the
application of other frameworks. We were able to use our framework
with several Army simulations. Since one of our objectives was to
demonstrate the feasibility of applying our framework, it was neither
necessary nor practical to review all or even a large number of the
simulations used in major weapon-systems acquisition programs. The
complex and technical nature of the simulations and our 14 factors
called for a method suited to in-depth assessment. The case study method
was the most plausible for illustrating the application of the
framework. One limitation of this approach is, of course, that it
prevents us from generalizing from our findings regarding the credibility
of the simulations we selected to any other simulations.


The  Structure  of  This 
Report 


Our findings are presented in chapters 2 and 4-7. In chapter 2, we
describe concepts others have used in assessing simulations and the
framework we developed. In chapter 3, we describe the weapon systems
and the simulations in our three case studies. This provides important
background material for understanding our findings in the three
subsequent chapters. In chapters 4-6, we address the three major areas
of concern in our assessment framework. Table 1.1 shows this structure.


Table 1.1: The Structure of This Report

Question                                                      Discussion

1. What factors should be considered in a systematic          Chapter 2
   attempt to assess the credibility of a simulation?

2. What are the results of an assessment of selected
   weapon-system operational-effectiveness simulations
   with respect to these factors?

   a. Background data on the 3 case studies                   Chapter 3

   b. The credibility of a model based on theory, model       Chapter 4, appendix II
      design, and input data

   c. The credibility of a model based on correspondence      Chapter 5, appendix III
      between the model and the real world

   d. The credibility of a model based on support             Chapter 6, appendix IV
      structure, documentation, and reporting

3. What efforts has DOD made to foster and reinforce the      Chapter 7, appendix V
   credibility of its simulations?

In chapter 4, we describe the importance of theory, model design, and
input data as they contribute to credibility, and we discuss the
applicable factors from our framework. We summarize examples from our
analysis of the three case study simulations and include findings that
illustrate their strengths and limitations. A more detailed discussion of
these findings is in appendix II. We do the same in chapter 5 and
appendix III, where the area of concern is the correspondence between a
model and the real world, and in chapter 6 and appendix IV, where the
area of concern is a simulation’s basic support structure,
documentation, and reporting. In chapter 7, we examine the policies,
regulations, and structures that DOD and the Army used to promote the
credibility of the simulations with respect to their design,
implementation, and management. Our findings are summarized in chapter
8, which also includes our recommendations to DOD. Appendix V contains
comments from DOD about our draft report.
Prior Research

Various procedures have been proposed to permit reasoned judgment
concerning the credibility of simulation results. Several analysts have
proposed structures for what are variously called “assessments,”
“evaluations,” and “appraisals.” While terminology and structure differ,
a number of common themes appear. For example, S. I. Gass in 1983
proposed an assessment procedure that addresses 13 information items:
(1) mathematical and logical description, (2) model documentation, (3)
computer program documentation, (4) computer program consistency and
accuracy, (5) overall computer program verification, (6) technical
validity, (7) operational validity, (8) dynamic validity, (9) training,
(10) dissemination, (11) usability, (12) program efficiency, and (13)
overall model validation.¹ In 1979, we described 5 criteria necessary for
evaluating models: (1) documentation, (2) validity, (3) computer model
verification, (4) maintainability, and (5) usability.²

T. I. Oren in 1981 identified six components for systematically assessing
the acceptability of a simulation study. They were (1) data, (2) model,
(3) experimentation specification, (4) computer program, (5)
methodology and technique, and (6) simulation results.³ Oren presents a
framework that allows an assessment of the concepts and criteria related
to the acceptability of the components.

G. L. Harris’s 3 items for gaining and maintaining credibility were (1)
model qualification (focused on the simulated phenomenon’s
representation in theory and data), (2) computer model and program
verification, and (3) general validation of the computer model.⁴ Each
item, in turn, was defined with a detailed procedural checklist.

Banks, Gerstein, and Searles developed a 7-step modeling structure that
is both the framework for creating the model and the structure for
performing the evaluation. The steps within the structure include (1)
system feasibility, (2) requirements definition, (3) preliminary design,
(4) detailed design, (5) coding, (6) testing, and (7) operations and
maintenance.⁵ A number of specific procedures and evaluation criteria are
identified for each step.


¹S. I. Gass, “Decision-Aiding Models: Validation, Assessment, and Related
Issues for Policy Analysis,” Operations Research, 31:4 (July-August
1983), 618.

²U.S. General Accounting Office, Guidelines for Model Evaluation,
exposure draft, GAO/PAD-79-17 (Washington, D.C.: January 1979), p. 9.

³T. I. Oren, “Concepts and Criteria to Assess Acceptability of Simulation
Studies: A Frame of Reference,” Communications of the ACM, 24:4 (1981),
181.

⁴G. L. Harris, Computer Models, Laboratory Simulators, and Test Ranges:
Meeting the Challenge of Estimating Tactical Force Effectiveness in the
1980’s (Fort Leavenworth, Kansas: U.S. Army Command and General Staff
College, 1979), p. vi.

Although  the  emphases  may  differ,  the  purpose  of  each  assessment 
structure  is  to  guide  the  analyst  in  determining  a  simulation’s  credibil¬ 
ity.  We  used  several  structures  in  developing  our  framework.  Since 
probably  no  framework  can  be  exhaustive  and  also  practical,  we  sought 
to  highlight  the  most  critical  matters  for  determining  the  strengths  and 
weaknesses  of  a  simulation. 


Our  Framework 


To  assemble  the  factors  necessary  in  any  systematic  attempt  to  assess 
credibility,  we  looked  for  factors  that  research  and  experience  indicated 
should  be  linked  to  confidence.  We  found  three  major  areas  of  concern 
and  14  factors. 


Theory, Model Design, and Input Data


The  first  area  of  concern  pertains  to  how  a  simulation  model  imitates  a 
weapon  and  its  environment.  Matters  of  interest  include  the  characteri¬ 
zation  of  the  weapon  system  and  its  operation  in  both  its  immediate 
environment  and  its  larger  combat  arena,  the  mathematical  representa¬ 
tion  of  the  real  world,  the  indicators  of  the  weapon's  effectiveness,  and 
the  data  for  initiating  the  simulation  and  providing  ongoing  input. 
Briefly,  the  concern  is  with  the  theory  that  underlies  the  simulation,  the 
design  of  the  model,  and  the  input  data.  These  basic  components  in  con¬ 
structing  a  simulation  determine  the  results  and  thereby  seriously  affect 
their  credibility.  We  represent  these  concepts  in  the  first  7  factors  in 
table  2.1. 


5J. Banks, D. M. Gerstein, and S. P. Searles, "The Verification and Validation of Simulation Models,"
School of Industrial and Systems Engineering, Georgia Institute of Technology, Atlanta, Georgia,
1986, pp. 5 and 28-118.
Table 2.1: A Framework for Assessing the Credibility of a Simulation

Area of concern                     Factor

A. Theory, model design, and     1. Match between the theoretical approach of the
   input data                       simulation model and the questions posed
                                 2. Consideration of the weapon system's important
                                    operational measures of effectiveness
                                 3. Portrayal of the immediate environment in which the
                                    weapon will be used
                                 4. Representation of the weapon system's operational
                                    performance
                                 5. Depiction of the critical aspects of the broad-scale
                                    environment of the battle
                                 6. Appropriateness of the mathematical and logical
                                    representations of combat
                                 7. Selection of input data

B. The correspondence between    8. Evidence of a verification effort
   the model and the real        9. Evidence that the results are statistically
   world                            representative
                                10. Evidence of sensitivity testing
                                11. Evidence of validation of results

C. The support structures,      12. Establishment of support structures to manage the
   documentation, and               simulation's design, data, and operating requirements
   reporting                    13. Development of documentation to support the
                                    information needs of persons using the simulation
                                    or its results
                                14. Disclosure of the simulation's strengths and
                                    weaknesses when the results are reported



Credibility as indicated by the first 7 factors depends partly on how the
simulation  is  intended  to  be  used  in  decisionmaking.  That  is,  it  derives  in 
part  from  the  match  between  the  simulation  model  and  the  purpose  of 
the  simulation.  If  critical  features  of  the  weapon  system,  its  environ¬ 
ment,  and  its  operation  in  combat  are  not  portrayed  appropriately  for 
the  purpose  of  the  simulation,  the  results  may  be  inaccurate  or 
irrelevant. 

For  example,  if  the  ability  of  a  missile’s  guidance  system  to  function 
properly  is  an  important  concern  to  decisionmakers,  then  a  model  using 
a  superficial  characterization  of  guidance  dynamics  probably  would  not 
be  suitable.  But  if  the  missile’s  guidance  system  is  just  a  small  part  of 
much  larger  concerns  about  what  happens  in  a  multiweapon  battle,  it 
may  be  possible  to  model  the  guidance  system  in  a  very  simple  way 
without damaging the credibility of the results. Several of the first 7
factors  focus  attention  on  the  match  between  the  model  and  the  pro¬ 
posed  use  of  the  simulation  and  its  results. 

Because  all  simulations  depend  heavily  on  judgment  in  selecting  model¬ 
ing  techniques,  identifying  functional  relationships,  choosing  scenarios, 
and  selecting  sources  of  input  data  in  representing  the  real  world,  it  is 
important  that  judgment  be  based  on  a  knowledge  of  military  opera¬ 
tions,  the  physics  of  weaponry,  the  behavior  of  military  personnel,  logis¬ 
tics,  and  the  results  from  tests  of  weapons  and  their  use  in  combat. 
Incomplete  knowledge  and  poor  judgment  may  fundamentally  distort 
the  results,  and  evidence  of  such  conditions  will  lessen  the  credibility  of 
a  simulation.  The  intent  of  several  of  the  7  factors  is  to  manifest  such 
evidence. 


A Model and the Real World


The second area of concern is the correspondence between simulation
outcomes and real-world outcomes, factors 8-11 in table 2.1. Of foremost
concern in this context is the idea of "model validation," which refers to
the process of determining the agreement between the real-world system
being modeled and the model itself and, thus, determining whether the
model is an accurate representation for a particular application.

Validation includes the application of tests to the simulation. Although
no ultimate test or test sequence confers validity, a model can pass
enough appropriate tests so that qualified researchers would say that it
appears to be valid or that the results are credible. In the development
and implementation of a simulation, attention must be given to the procedures
(such as tests of face validity, or expert reviews of the model
and its results) that will increase the correspondence between the
results of the simulation and the results of operational testing, combat
operations, and other simulations. For a number of reasons, such as limited
resources and data, validity checks may be performed rarely or
very weakly. Credibility is seriously threatened if little or no evidence
demonstrates that results correspond closely to reality.

A related but narrower idea is that of "verification," which refers to the
process for determining that a computer-based model performs as the
program analysts intend, that the computer programming is correct and
internally consistent.6 The lack of evidence that programming errors
have been sought and removed lessens the credibility of the model and
its results, even when the theoretical formulation of the simulation is
considered to be fundamentally correct.


Support Structures, Documentation, and Reporting


The third area of concern is the institutional process covering practices
such as configuration management, oversight and review, and documentation
and reporting, which help ensure that credible simulations are
established and maintained. Factors 12-14 in table 2.1 deal with this
area.

Simulation models that exist independently of the problems they can
address are often revised in order to correct errors or omissions, reflect
current information about systems or the environment, respond to specific
modeling needs, and operate with revised computer languages and
new equipment. An organization responsible for simulations should
have an established process for changing the features of a model, such
as modifying the input data, the computer programs, or its documentation
and copies.

For simulation models that are used by many analysts over a long
period of time, modifications not centrally approved or disseminated can
result in users' not knowing what features are and are not included in a
simulation. Such uncontrolled changes coupled with weak documentation
can make it difficult for analysts and managers to understand how
the results were derived. Furthermore, when the results are reported
without sufficient detail about the simulation's capabilities and limitations,
decisionmakers may risk using those results inappropriately.

These threats to credibility undermine the user's ability to understand
and use a simulation.


Summary


By addressing the 14 factors in our framework and by collecting and
reviewing the information available for each of them, we believe one can
identify the strengths and weaknesses that affect the credibility of a
simulation. We did not attempt to weight the 14 factors for their relative
importance or formulate an overall rating that a weighting system
would produce. We believe that if a simulation is sound, applying our
framework to it will reveal its soundness and reassure decisionmakers
about using the results; if it is not sound, the framework will indicate
the weaknesses.

6These definitions are commonly used in the operations research and modeling communities and they
are the ones most often found in DOD documents. A few scientists define verification as agreement
with reality and validation as the investigation of internal consistency. The concept of simulation
validity is sometimes used in the literature to refer to the totality of a review framework. As we use
it, however, validation refers to the process of developing confidence in the simulation results by
comparing the simulation output with data from other sources.

In  sum,  the  credibility  of  simulation  results  has  been  defined  in  terms  of 
how  much  confidence  one  has  that  a  simulation  closely  reflects  reality. 
We  have  argued  that  credibility  is  accumulated  from  three  kinds  of  evi¬ 
dence:  (1)  a  model  and  its  input  data  have  appropriately  portrayed  the 
important  features  of  the  weapon  system  being  simulated  and  its  envi¬ 
ronment,  (2)  the  model  produces  results  similar  to  results  from  the  real 
world,  and  (3)  the  procedures  followed  in  developing,  maintaining,  and 
using  the  model  tend  to  minimize  discrepancies  between  simulation 
results  and  real-world  results. 
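
The framework in table 2.1 can be read as a simple checklist: 14 factors grouped under three areas of concern, with strengths and limitations recorded for each and no weights or overall score. The sketch below is only an illustration of that structure. The names `AREAS`, `FACTORS`, `Finding`, `summarize`, and `unaddressed` are hypothetical, the factor labels are shortened paraphrases of table 2.1, and nothing here represents a tool used in the review.

```python
from dataclasses import dataclass, field

# Areas of concern, as named in table 2.1.
AREAS = {
    "A": "Theory, model design, and input data",
    "B": "The correspondence between the model and the real world",
    "C": "The support structures, documentation, and reporting",
}

# (factor number, area letter, shortened label); labels paraphrase table 2.1.
FACTORS = [
    (1, "A", "Match between theoretical approach and questions posed"),
    (2, "A", "Operational measures of effectiveness"),
    (3, "A", "Portrayal of the immediate environment"),
    (4, "A", "Representation of operational performance"),
    (5, "A", "Depiction of the broad-scale battle environment"),
    (6, "A", "Mathematical and logical representations of combat"),
    (7, "A", "Selection of input data"),
    (8, "B", "Evidence of a verification effort"),
    (9, "B", "Evidence that results are statistically representative"),
    (10, "B", "Evidence of sensitivity testing"),
    (11, "B", "Evidence of validation of results"),
    (12, "C", "Support structures for design, data, and operation"),
    (13, "C", "Documentation for users of the simulation or its results"),
    (14, "C", "Disclosure of strengths and weaknesses in reporting"),
]


@dataclass
class Finding:
    """Evidence recorded for one factor; deliberately carries no weight."""
    factor: int
    strengths: list = field(default_factory=list)
    limitations: list = field(default_factory=list)


def summarize(findings):
    """Group recorded strengths and limitations by area of concern."""
    area_of = {number: area for number, area, _ in FACTORS}
    summary = {area: {"strengths": [], "limitations": []} for area in AREAS}
    for finding in findings:
        area = area_of[finding.factor]
        summary[area]["strengths"].extend(finding.strengths)
        summary[area]["limitations"].extend(finding.limitations)
    return summary


def unaddressed(findings):
    """Return the factor numbers for which no evidence was recorded."""
    covered = {f.factor for f in findings if f.strengths or f.limitations}
    return [number for number, _, _ in FACTORS if number not in covered]
```

Keeping the findings as unweighted lists mirrors the choice described above: the output is a set of strengths and weaknesses per area, not a single rating.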
Early in our review, we believed it necessary to assess simulations
within the context of their application, and we concluded that the best
way to select candidate simulations for our case studies was to start
with the weapon-system programs themselves. That is, by choosing a
weapon system, reviewing its history, and talking with knowledgeable
persons involved with it, we were led to the simulations that were used
for it. We limited ourselves first to "major systems," that is, systems projected
to cost at least $200 million for research, development, testing, and evaluation
or $1 billion for production. Then we imposed further conditions:
a system's proximity to the full-scale production decision; the
use of simulations in its research, development, testing, and evaluation;
the existence of a body of empirical data; and its employment or control
by low-level tactical units for which data were available. This led us to
select the DIVAD and the Stinger as especially suitable weapon systems.


The  Weapons 


The  air  defense  mission  is  to  nullify  or  reduce  the  effectiveness  of  attack 
or  surveillance  by  hostile  aircraft  or  missiles  after  they  are  airborne, 
thereby  supporting  the  fundamental  Army  function  of  conducting 
prompt  and  sustained  land  warfare  operations.  Protecting  critical  opera¬ 
tional  and  strategic  assets  from  enemy  aircraft  is  a  primary  part  of  the 
mission;  the  attrition  of  enemy  aircraft  is  secondary.  Short-range  air 
defense  artillery  units  engage  enemy  close-air-support  helicopters  and 
fixed-wing  aircraft,  and  when  there  is  high-intensity  conflict  with 
enemy  ground  forces,  engage  ground  targets  in  self-defense. 


The DIVAD

The DIVAD was developed to replace the Vulcan air defense system,
which was perceived as no longer able to defeat attack aircraft or
armored assault helicopters. In addition to filling this void in the forward
battle area, the DIVAD was to engage lightly armored vehicles,
trucks, and personnel. The system was operated by a three-member
crew.

The DIVAD's turret and other components, such as the prime power unit,
were mounted on an M48A5 tank chassis, and, overall, the DIVAD closely
resembled a tank. However, when its prominent radar antennae were
extended, the system's height was 15 feet. The M1 tank's height, in comparison,
is 8 feet. The DIVAD's major subsystems were the tank chassis;
the turret, which contained most of the system's electronic equipment;
and the radar, which was derived from the F-16 aircraft's radar. The
radar was backed up by a fully integrated electro-optical sighting and
ranging system consisting of a laser range finder and optical day sights.
Its primary armaments were twin 40-mm Bofors L70 guns that could be
fired automatically or semiautomatically, either singly or in pairs. The
ammunition for the system consisted of proximity-fused, point-detonating,
and target-practice rounds. The system also had a 7.62-mm machine
gun mounted on a pedestal next to the squad leader's hatch.

The request for proposals for engineering development for the DIVAD was
issued in April 1977, and engineering development contracts were
awarded to Ford Aerospace and Communications Corporation and General
Dynamics Corporation in January 1978. After development and
operational testing of the prototypes, Ford was awarded a fixed-price
incentive contract to complete engineering development in May 1981. In
May 1982, the DIVAD passed its program review, and the production of
50 systems was authorized. In May 1983, an additional 96 systems were
authorized, and additional testing and evaluation followed. The DIVAD
weapon-system program was cancelled in August 1985.


The Stinger

The Stinger is a passive, shoulder-fired, infrared-seeking, guided missile
with an antiaircraft, air defense mission to fulfill Army, Marine Corps,
and Air Force requirements. The 34.5-pound weapon system consists of
a missile in a launch tube and a reusable gripstock containing the firing
circuits and identification-friend-or-foe (IFF) electronics. Both the gunner
and crew chief may acquire the target and fire the weapon, although the
crew chief generally fires only when the gunner is engaged with another
target. Acquiring a target includes an interrogation with the integral IFF
system. If the target proves hostile, the missile is launched to intercept
and destroy it. After the missile has been launched, the crew member is
free to engage another target, take cover, or move to another location.

The Stinger's mission is to provide air defense support in forward battle
areas and to high-priority resources throughout the divisional areas of
operation. The Stinger's concept definition began in 1968 in response to
combat deficiencies in the Redeye. The system's design was completed
by December 1972. In April 1978, full-scale production began, and initial
operational capability was achieved in February 1981. In June 1977,
however, the Army had begun the engineering development of an
improved version, known as the Stinger-POST, whose full-scale production
began in July 1985. Another improved version, with a reprogrammable
microprocessor, began development in September 1984.

The Stinger is used throughout the battle area. In the rear, it is used as a
point air defense weapon for high-value resources, and in the forward
area it is used against high-speed, low-level, ground-attack aircraft and
helicopters. Additional capabilities are being designed so that it can be
used at night, as an air-to-air missile for helicopter use, and in a new
lightweight air defense system. In 1984, the inventory requirement for
the Stinger was more than 60,000 missiles for the Army, Marine Corps,
and Air Force.


The  Simulations 


Within the research, development, testing, and evaluation programs for
the DIVAD and Stinger weapon systems, we found a number of simulations
used to answer a variety of questions pertaining to the systems'
concepts, engineering design and performance, costs, and operational
effectiveness. The air defense air-to-ground engagement simulation
(ADAGE, consisting of two "submodels" called "Incursion" and "Campaign")
and the Carmonette simulation, both used in the DIVAD's acquisition
program, and the COMO III air defense combat simulation, used for
the Stinger's program analyses, were concerned with operational effectiveness;
we focused on this because it is of interest to decisionmakers.
These three simulations varied in a number of key features, including
the type of simulation model, the treatment of uncertainty, size and
duration of battle, attrition calculations, the coverage of air-to-ground
interaction and ground battle, the coverage of resupply, and computer
running time. These features are summarized in table 3.1.


Table 3.1: The Key Features of the ADAGE, Carmonette, and COMO III Simulation Models

Feature                     ADAGE Incursion           ADAGE Campaign            Carmonette                COMO III

Model type                  Functional                Functional                Combined arms             Functional

Treatment of uncertainty    Monte Carlo               Expected value or         Monte Carlo               Monte Carlo
                                                      deterministic

Size of battle              Division                  Division                  Battalion                 All levels up to theater

Length of battle            Not applicable            Several days              Short, intense            Short battles up to
                                                                                firefights, about 25      2 hours
                                                                                minutes

Attrition calculation       One-on-one models of      Probabilities developed   Monte Carlo models of     Monte Carlo models of
                            each air defense weapon   in Incursion              specific events using     specific events using
                            type against each                                   one-on-one data           one-on-one data
                            target type

Treatment of time           Sequenced by time         Calculated at end of      Sequenced by event        Sequenced by event
                                                      mission

Air-to-ground interaction   None                      Played                    Played                    Played

Ground battle               None                      Played using data         Played                    None
                                                      outside the model

Resupply                    Not applicable            Played                    None                      None

Computer time               Short                     Short                     Long                      Long

Most of the features are self-explanatory or are covered in detail in later
chapters  and  appendixes  in  this  report.  A  few  are  described  here.  Func¬ 
tional  models  study  a  particular  military  function,  such  as  air  defense, 
whereas  combined-arms  models  evaluate  alternative  combinations  of 
combat  forces,  such  as  alternative  combinations  of  armor,  infantry, 
artillery,  and  air  support  for  a  given  level  of  battle. 

In  the  treatment  of  uncertainty  by  Monte  Carlo  modeling,  important 
real-world  parameters  are  described  by  means  of  probability  distribu¬ 
tions.  A  very  large  number  of  random  inputs  is  sampled  from  those  dis¬ 
tributions  and  the  simulation  result  itself  is  expressed  as  a  distribution. 
In  contrast,  in  the  expected-value  (or  deterministic)  approach,  mathe¬ 
matical  expectations,  generally  the  mean  of  a  distribution,  summarize 
the  random  variables  that  describe  real-world  conditions.  Such  a  model 
is  deterministic  because  the  result  it  produces  is  certain  to  follow  from 
the  initial  conditions. 
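
The contrast between the two treatments of uncertainty can be illustrated with a minimal sketch. Everything in it is a hypothetical assumption chosen for illustration: the engagement rule (a fixed number of independent shots), the uniform distribution for the single-shot kill probability, the function names, and the numbers. None of it is drawn from the ADAGE, Carmonette, or COMO III models.

```python
import random
import statistics


def kills_in_engagement(shots, p_kill, rng):
    """Count kills from independent shots, each succeeding with p_kill."""
    return sum(1 for _ in range(shots) if rng.random() < p_kill)


def monte_carlo_kills(trials, shots, p_low, p_high, seed=0):
    """Monte Carlo treatment: sample the uncertain parameter many times
    and return one simulated outcome per trial (a distribution)."""
    rng = random.Random(seed)
    outcomes = []
    for _ in range(trials):
        p_kill = rng.uniform(p_low, p_high)  # drawn from its distribution
        outcomes.append(kills_in_engagement(shots, p_kill, rng))
    return outcomes


def expected_value_kills(shots, p_low, p_high):
    """Expected-value treatment: replace the distribution by its mean,
    so the same inputs always yield the same single number."""
    mean_p = (p_low + p_high) / 2.0
    return shots * mean_p


if __name__ == "__main__":
    outcomes = monte_carlo_kills(trials=10_000, shots=8, p_low=0.1, p_high=0.5)
    print("Monte Carlo mean kills:", statistics.mean(outcomes))
    print("Monte Carlo std. dev.: ", statistics.pstdev(outcomes))
    print("Expected-value kills:  ", expected_value_kills(8, 0.1, 0.5))
```

In this sketch the Monte Carlo mean should fall close to the expected-value figure, but only the Monte Carlo run also reports how widely individual outcomes vary around it.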


ADAGE

The ADAGE model is a functional simulation used to study the relative
effectiveness of combinations of air defense weapons in a division. The
Incursion submodel uses the Monte Carlo methodology to model the
attrition of a single threat aircraft from a single ground-based weapon.
The Campaign submodel then uses these engagement attrition data from
the Incursion submodel to calculate expected-value results for a specific
scenario of many weapons and targets.

The ADAGE Incursion simulates detection, threat reaction, the masking of
the threat aircraft, reloading, and weapon-to-target interactions. The
ADAGE Campaign simulates small raids by enemy aircraft attacking division
ground targets over a span of several days. In the Campaign submodel,
the number of air defense weapons and other ground weapons
destroyed is based on an expected value derived from the number of
attacking aircraft, the type of ordnance, and the type of target. Measures
of effectiveness include the number of threat aircraft destroyed,
the number of air defense and other ground weapons remaining and the
number destroyed, the amount of air defense ammunition used, and the
number of friendly aircraft remaining.
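
The two-stage arrangement described above, in which a Monte Carlo submodel supplies engagement probabilities to an expected-value submodel, can be sketched as follows. This is not the ADAGE algorithm. The detect-then-hit rule, the function names, the parameters, and the bookkeeping (only attacking aircraft are tracked) are all hypothetical simplifications.

```python
import random


def estimate_engagement_pk(trials, p_detect, p_hit_given_detect, seed=0):
    """Stage 1 (Monte Carlo): estimate the chance that one weapon destroys
    one aircraft in a single engagement by sampling detect-then-hit events."""
    rng = random.Random(seed)
    kills = 0
    for _ in range(trials):
        detected = rng.random() < p_detect
        if detected and rng.random() < p_hit_given_detect:
            kills += 1
    return kills / trials


def campaign_expected_values(pk, days, raids_per_day, aircraft_per_raid,
                             engagements_per_aircraft, initial_aircraft):
    """Stage 2 (expected value): carry the stage-1 probability through a
    multi-day scenario deterministically, with no random sampling."""
    aircraft_remaining = float(initial_aircraft)
    aircraft_destroyed = 0.0
    # Probability that one aircraft survives every engagement on one raid.
    p_survive_raid = (1.0 - pk) ** engagements_per_aircraft
    for _ in range(days):
        for _ in range(raids_per_day):
            attackers = min(aircraft_per_raid, aircraft_remaining)
            losses = attackers * (1.0 - p_survive_raid)  # fractional losses
            aircraft_remaining -= losses
            aircraft_destroyed += losses
    return {"aircraft_destroyed": aircraft_destroyed,
            "aircraft_remaining": aircraft_remaining}


if __name__ == "__main__":
    pk = estimate_engagement_pk(trials=20_000, p_detect=0.8,
                                p_hit_given_detect=0.5)
    print(campaign_expected_values(pk, days=3, raids_per_day=2,
                                   aircraft_per_raid=4,
                                   engagements_per_aircraft=2,
                                   initial_aircraft=20))
```

Because the second stage works with averages, it reports fractional aircraft and gives no indication of how much a single campaign might depart from the expected result.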

The ADAGE was developed by the U.S. Army Materiel Systems Analysis
Activity specifically to study the DIVAD. It was used first for the division
air defense cost-and-operational-effectiveness analysis conducted in
1977.1 It was also used for the 1984 update of this analysis and, earlier,
in 1979 for the short-range, air defense, portable force structure analysis
and in 1985 for the DIVAD comparative analysis. The ADAGE has been used
for other air defense studies as well.


Carmonette

Designed about 30 years ago, the Carmonette is a combined-arms combat
model  that  simulates  small-unit,  ground  combat  involving  the  actions  of 
individual  soldiers  and  weapons.  Analysts  design  small-unit  engage¬ 
ments  to  examine  specific  questions  such  as,  “In  a  battalion  assault, 
what  are  the  trade-offs  between  armor,  infantry,  and  artillery?”  The 
Carmonette  includes  all  combined  arms:  infantry,  mounted  or  dis¬ 
mounted;  artillery,  including  air  defense  artillery,  and  mortars;  and 
armored  vehicles  and  helicopters.  Even  though  the  Carmonette  was 
designed  to  simulate  weapon-to-weapon  duels,  its  proper  use  is  for 
larger  engagements  of  combined-arms  actions  in  which  weapon-to- 
weapon  data  are  used  as  input.  The  focus  of  the  Carmonette  is  the  bat¬ 
tle,  not  individual  weapon  systems.  The  Carmonette  assumes  an  intense 
25-minute  battalion  task  force  battle. 

The Carmonette has been used extensively to model ground warfare.
The U.S. Army Training and Doctrine Command has characterized it as
an operational-effectiveness model in which the various systems on the
battlefield are related in a way that allows for an investigation of their
synergism. In addition to its ground warfare applications, the
Carmonette was used in the 1984 and 1985 analyses of the DIVAD and in
advanced-attack helicopter and antihelicopter studies.


COMO III

The COMO III, used primarily for studies of tactical air defense effectiveness,
is a Monte Carlo, functional simulation in which particular submodels
are combined to simulate a specific air defense environment.
Weapon-system submodels include specific ground-based air defense
and threat aircraft, and other submodels simulate functions such as
communications and jamming.

The scale of battle can range from individual battles to a division to the
theater. Time, in the range of 2 hours, generally represents a period
short enough that logistic support is not an issue. It is a standard Army
model for tactical air defense artillery effectiveness studies.

1This type of analysis is a comparative evaluation of alternative systems, their contribution to the
force, and their costs in personnel and funds. Its purpose is to assist in the selection of a preferred
course of action to meet a stated Army need. It is conducted prior to each acquisition milestone decision
for major systems and other systems designated by the Army. Among its many subanalyses, the
analysis of effectiveness is usually the most controversial.

COMO III was developed in 1966 in the Netherlands by the technical
center of the Supreme Headquarters of the Allied Powers in Europe as
an advance over an earlier model. It has been used to investigate broad
air defense concepts, the effectiveness of particular weapon systems,
naval task force air defense, and the air defense structure of the Warsaw
Pact nations, among others. It was used to evaluate the Stinger in
conjunction with other air defense weapons and to determine the
Stinger's support requirements. (The COMO III simulation report we
examined was entitled the "Stinger Battery Coolant Unit Usage Study.")
In this chapter, we focus on the first area of concern in our framework,
that is, the simulation model and its underlying theory, model design, and
input data, and on the 7 factors we identified for it in chapter 2 (see table 4.1).
Information about how the ADAGE, Carmonette, and COMO III models were
used in effectiveness analyses of the DIVAD and the Stinger may be found
in appendix I, and a more detailed discussion of the findings in this
chapter appears in appendix II.


Table 4.1: The Seven Factors for Theory, Design, and Data(a)

Area of concern                     Factor

Theory, model design, and        1. Match between the theoretical approach of the
input data                          simulation model and the questions posed
                                 2. Consideration of the weapon system's important
                                    operational measures of effectiveness
                                 3. Portrayal of the immediate environment in which the
                                    weapon will be used
                                 4. Representation of the weapon system's operational
                                    performance
                                 5. Depiction of the critical aspects of the broad-scale
                                    environment of the battle
                                 6. Appropriateness of the mathematical and logical
                                    representations of combat
                                 7. Selection of input data

(a) The two remaining areas of concern and 7 other factors are in table 2.1.


The Match Between the Theoretical Approach and the Questions Posed


A  simulation  quite  credible  in  the  abstract  may  not  meet  the  specific 
needs  of  its  user,  depending  on  the  model’s  theoretical  approach.  The 
purpose  may  have  been  to  create  an  engineering  model  to  determine  the 
optimal  design  of  a  weapon  relative  to  its  technical  requirements,  a 
functional  model  to  aid  in  selecting  the  most  effective  weapon  system 
from  alternative  systems  performing  the  same  functional  element  of 
combat,  or  a  combined-arms  model  to  compare  alternative  combinations 
of complementary weapon systems (for example, air defense weapons,
infantry,  helicopters,  and  tanks).  Table  4.2  summarizes  our  case  study 
assessment  of  this  factor. 
Table 4.2: The Match Between Theory and Questions

Weapon    Model         Strength                                      Limitation

DIVAD     ADAGE         Functional model designed for DIVAD and       Expected-value approach; probabilities
                        other air defense studies; useful for         developed in the first submodel; incomplete
                        comparing alternative air defense systems     consideration of the random factors of
                                                                      modern warfare in second submodel

          Carmonette    Combined-arms Monte Carlo model for broad     Emphasizes ground battle; not well suited
                        questions of warfare; treats the random       for studying the effectiveness of competing
                        factors of warfare probabilistically          air defense systems; air defense a recent
                                                                      add-on, especially for fixed-wing aircraft;
                                                                      not focused on individual weapon systems

Stinger   COMO III      Functional Monte Carlo model for air          Absence of ground battle modeling suggests
                        defense issues; useful for comparing          that simulation of air defense in the more
                        alternative air defense systems               forward areas may be missing an important
                                                                      element of realism


A functional air defense model was a reasonable choice for studying the
DIVAD's performance in comparison with other air defense alternatives.
The ADAGE model emphasizes ground-based air defense weapons and
otherwise generally focuses on how changes in air defense capability
can change outcomes in ground and air-to-air battles.

The Carmonette was designed to answer broad trade-off questions
beyond issues of air defense. As a combined-arms model, it is generally
not as well suited to answering the questions about air defense alternatives
that were posed about the DIVAD. The model attempts to portray an
overall ground battle with limited air war features but is not focused on
individual weapon systems.

The COMO III is similar to the ADAGE in that it is a functional model
designed specifically to study air defense issues. In general, the COMO III
model is properly matched to the questions asked about the Stinger. It
was based on a standard scenario generated by the U.S. Army Air
Defense Artillery School.


Operational Measures of Effectiveness


If the measures of effectiveness a simulation addresses are not related
to the weapon system's mission, conclusions about the system's
performance in combat may not be credible, even if the simulation is
sound in other respects. The first mission of air defense systems is to
protect critical resources from enemy aircraft; the second is to destroy
enemy aircraft. Therefore, we looked for the coverage of measures of
effectiveness reflecting these missions. Table 4.3 summarizes what we
found.
Table 4.3: Operational Measures of Effectiveness

Weapon: DIVAD. Model: ADAGE.
  Strength: Emphasizes the protection of critical assets as well as
    giving attrition factors
  Limitation: No coverage of effects of aircraft mission aborts; the
    effect of ground losses to enemy air attacks, an important factor in
    measuring operational effectiveness, appears excessive

Weapon: DIVAD. Model: Carmonette.
  Strength: Reports mission aborts and helicopter remaskings caused by
    air defense artillery and radar warning
  Limitation: Emphasizes attrition factors with little coverage of
    protection of critical assets

Weapon: Stinger. Model: COMO III.
  Strength: Presents wide range of measures
  Limitation: No modeling of ground battles limits capacity to measure
    protection of critical assets; concentrates on attrition factors

Both the ADAGE and Carmonette simulations provide for the protection
of critical resources to some degree; the former emphasizes it, whereas
the latter emphasizes measures of aircraft attrition. Although the
COMO III concentrates on measures of both attrition and weapon usage,
it is more limited in its ability to use the preservation of resources
as a principal measure of effectiveness, because ground war is not
simulated. This threatens the credibility of the results of this
simulation.


The Portrayal of the Immediate Environment


In looking at how adequately a simulation model portrays a weapon
system in its immediate wartime environment, we focused on five
attributes of a plausible battle scenario: the size of the battle, the
duration of the battle, the nature and behavior of enemy targets, the
deployment and movement of the weapon being evaluated, and the terrain
over which the battle might take place. These attributes are summarized
in table 4.4.
Table 4.4: Portrayal of the Immediate Environment

Weapon: DIVAD. Model: ADAGE.
  Battle size
    Strength: A division model for a weapon with division-level
      responsibilities
  Battle length
    Strength: Covers up to 30 days, permitting the measurement of the
      cumulative effects of air defense
  Target
    Strength: Covers all potential targets, including helicopter and
      fixed-wing aircraft
    Limitation: Covers only nonjinking helicopters and aircraft with
      fixed flight paths
  Deployment and movement
    Limitation: Deployment of ground assets is static; movement is only
      indirectly modeled
  Terrain
    Strength: A statistically general terrain that can be generalized to
      many areas
    Limitation: A statistically general terrain representing no "real"
      terrain

Weapon: DIVAD. Model: Carmonette.
  Battle size
    Limitation: A battalion model for a weapon with division-level
      responsibilities
  Battle length
    Limitation: A 25-45-minute firefight that ignores the cumulative
      effects of air defense
  Target
    Limitation: Stresses helicopters; most studies did not include
      fixed-wing aircraft
  Deployment and movement
    Strength: A fully dynamic model capturing the effects of movement of
      ground weapons
  Terrain
    Strength: A digitized, specific, "real" terrain
    Limitation: A digitized, specific terrain that cannot be generalized
      to other areas

Weapon: Stinger. Model: COMO III.
  Battle size
    Strength: Covers all levels up to brigade, capturing the full range
      of air defense responsibilities
    Limitation: Portrayed a limited environment because a larger
      scenario would have been too intensive a use of computer resources
  Battle length
    Limitation: Covers short battles up to several hours, ignoring the
      cumulative effects of air defense
  Target
    Strength: Covers the engagement of helicopters and fixed-wing
      aircraft
  Deployment and movement
    Limitation: Static deployment of ground assets; movement is only
      indirectly modeled
  Terrain
    Strength: A digitized, specific, "real" terrain
    Limitation: A digitized, specific terrain that cannot be generalized
      to other areas


The evidence indicates that the ADAGE and COMO III can simulate a
weapon system's immediate environment across these attributes with
some limitations. Both are strong in characterizing the size of battle
and the full range of targets. The ADAGE simulates longer battles but is
limited by its uniform and static deployment of weapons. The COMO III
portrays a shorter battle with the Stinger weapons; they are deployed
realistically but do not move, a limitation for portable systems for
which movement provides a form of individual defense at the cost of
decreased operability. The COMO III and ADAGE use different approaches
to portray terrain. The COMO III simulates specific terrain; the ADAGE
uses a statistical portrayal. Neither is obviously superior to the
other.

The Carmonette is more limited in its ability to portray the immediate
environment than the ADAGE and COMO III. The battalion size, which is
small, and the short duration of the battle are inappropriate for the
DIVAD weapon, and the lack of fixed-wing aircraft targets for most of
the analyses we examined resulted in an incomplete set of targets.
These limitations were partially offset by the Carmonette's realistic
portrayal of deployment, movement, and terrain but nevertheless
threatened its credibility.


Operational Performance


We assessed the simulations across several attributes of a battle with
respect to the weapon systems' operational performance, covering both
detection and engagement. Four attributes pertained to the simulation of
target detection: visual detection; factors that might lessen
battlefield visibility; command, control, and communication, including
the problem of distinguishing between friend and foe; and, for the
DIVAD, radar detection.

Both the ADAGE and COMO III simulations are limited in the way they
depict the detection of enemy targets. For example, the ADAGE only
indirectly addressed the confusing elements of combat: battlefield
obscurants; command, control, and communication; and IFF. The ADAGE
also used indirect means to portray radar detection. The COMO III
indirectly includes battlefield obscurants and omits IFF. Our review of
the Carmonette simulation, however, indicates its ability to address
these more directly, although the features of the Carmonette that permit
the simulation of IFF and command, control, and communication were not
used in the DIVAD simulation. Our results are summarized in table 4.5.
Table 4.5: Portrayal of Key Detection Characteristics

Weapon: DIVAD. Model: ADAGE.
  Visual detection
    Strength: A visual detection submodel used in early studies covered
      weapon's full range
    Limitation: Determined only in the first submodel; to achieve
      full-range detection, later studies using a night vision and
      electro-optical laboratory model had to use forward-looking
      infrared capabilities that were not part of the DIVAD
  Battlefield obscurants
    Limitation: Only indirect play through probability of weapon
      participating in air battle; no night play
  IFF and command, control, and communication
    Limitation: Only indirect play through a visual detection submodel
  Radar detection
    Strength: Covers gun's full range
    Limitation: Only indirect play through input data adjustments;
      aircraft do not react to radar warnings

Weapon: DIVAD. Model: Carmonette.
  Visual detection
    Strength: Fully dynamic but with range limits; later studies using
      the visual detection submodel for detecting fixed-wing aircraft
      covered DIVAD's range limits
    Limitation: Used forward-looking infrared in a night vision and
      electro-optical laboratory model to detect helicopters
  Battlefield obscurants
    Strength: Covers night and most obscurants
  IFF and command, control, and communication
    Limitation: Model capabilities not used
  Radar detection
    Strength: Well detailed; early weaknesses overcome

Weapon: Stinger. Model: COMO III.
  Visual detection
    Limitation: Limited range, using look-up tables and the same search
      procedures for fixed-wing aircraft and helicopters
  Battlefield obscurants
    Limitation: Only indirect coverage, using degraded detection
      probabilities
  IFF and command, control, and communication
    Limitation: Not modeled
  Radar detection
    Not applicable to Stinger

Three of the attributes we examined pertained to a weapon's engagement
of a target after detecting it. The first of these was the
characteristics of the weapon system, such as technical capability and
operating modes. The second pertained to whether and how an enemy target
is actually engaged, called "engagement procedures." For example, a
model might or might not include the engagement of an enemy aircraft
flying past the air defense weapon en route to another target. And,
finally, we looked at whether and how the models handle raids by
multiple aircraft. See table 4.6.
Table 4.6: Portrayal of Key Engagement Characteristics

Weapon: DIVAD. Model: ADAGE.
  Weapon characteristics
    Strength: Coverage of technical capabilities and targets
  Engagement rules and procedures
    Strength: Description of weapon and how it engages different types
      of aircraft; coverage of engagement of aircraft flying by or
      attacking defended targets
    Limitation: No play of duels
  Multiaircraft raids
    Strength: Includes raids
    Limitation: Excludes spatial and temporal saturation effects

Weapon: DIVAD. Model: Carmonette.
  Weapon characteristics
    Strength: Corrected for erroneous early descriptions
  Engagement rules and procedures
    Strength: Prioritizes targets
    Limitation: Ignores aircraft flying past defended targets
  Multiaircraft raids
    Strength: Permits selection from several targets

Weapon: Stinger. Model: COMO III.
  Weapon characteristics
    Strength: Uses separate weapon programs adaptable to studying weapon
      modifications; good description of weapon characteristics
  Engagement rules and procedures
    Strength: Allows player to select from alternative procedures;
      different firing doctrines can be specified
  Multiaircraft raids
    Strength: Saturation can be demonstrated; good vehicle for
      demonstrating Stinger operations in conjunction with other air
      defense weapons

The evidence indicates that all three models portray engagement
characteristics in considerable detail; COMO III has perhaps the best
coverage. The ADAGE simulation was clearly limited in its treatment of
multiaircraft raids, which did not adequately account for how the raids
could saturate the defense; and the Carmonette model tended to ignore
aircraft passing through the battle area. The relative strengths of
these models in simulating the engagement aspects of a battle
contributed to their credibility.


The Broad-Scale Battle Environment


When seeking to determine the effectiveness of a weapon system,
attention is focused on the particular weapon, but other features of a
battle must also be taken into account. Air defense usually does not
operate in isolation, and other aspects of an ongoing battle may affect
the operation of weapons such as the DIVAD and Stinger. In assessing
these air defense simulations, we tried to take account of the bigger
picture by looking at three battle attributes that we labeled the air
war, the ground war, and the interaction between the two. Evidence of
the three simulations' capabilities is summarized in table 4.7.

Table 4.7: Portrayal of Broad-Scale Battle

Weapon: DIVAD. Model: ADAGE.
  Air war
    Strength: Notes damage from fixed-wing aircraft; plays air-to-air
      war
    Limitation: Treats saturation attacks inadequately
  Ground war
    Limitation: Uses attrition rates generated only outside the model
  Interaction
    Strength: Shows the relationship of air and ground wars
    Limitation: Plays air and ground wars not interactively but through
      expected values

Weapon: DIVAD. Model: Carmonette.
  Air war
    Limitation: Fixed-wing aircraft not modeled in early studies and
      modeled only indirectly in later studies
  Ground war
    Strength: Fully developed ground battle
  Interaction
    Strength: Fully dynamic interaction for helicopters
    Limitation: Uses a model similar to the ADAGE for multiaircraft
      raids by fixed-wing aircraft

Weapon: Stinger. Model: COMO III.
  Air war
    Strength: Detailed model of air war
    Limitation: Excludes fratricide from air defense artillery
  Ground war
    Limitation: No ground war
  Interaction
    Limitation: No interaction except for ground damage inflicted by
      aircraft; no ground-war damage to air defense

Our assessment indicates that the Carmonette has considerable ability
in broad-scale battle, probably more than either the ADAGE or COMO III,
largely because of its fully developed simulation of the ground battle.
However, its simulation of the air battle limits its usefulness for air
defense analyses. The COMO III's lack of a portrayal of the ground war
is a serious limitation for studying the full range of air defense
activities. The ADAGE included all three aspects of combat, but the
realism of its portrayal was limited.


Mathematical and Logical Representations of Combat


Having looked at the extent to which various aspects of a battle are
credibly accounted for in the overall design of the simulations, we
looked at their mathematical and logical representations. We noted only
minor problems for the Monte Carlo models, and overall the mathematical
and logical features of the Carmonette and COMO III contributed to the
credibility of their results (see table 4.8).
Table 4.8: Mathematical and Logical Representations

Weapon: DIVAD. Model: ADAGE.
  Limitation: Uses expected value in many-on-many engagements; poorly
    understood parameter determines the probability of various air
    defense weapons participating in battle; survivability is based on
    attrition rates applicable to weapon classes

Weapon: DIVAD. Model: Carmonette.
  Strength: Simulates specific dynamic interactions between individual
    air defense weapons and helicopters
  Limitation: Early problem of squaring of kill probabilities;
    generation of only one set of random numbers; fixed-wing model uses
    basically the same approach as the ADAGE in multiaircraft raids;
    problems external to the model in issues of experimental design and
    adequate number of model runs

Weapon: Stinger. Model: COMO III.
  Strength: Simulates specific dynamic interactions between individual
    air defense weapons and targets
  Limitation: Same as the Carmonette with regard to experimental design
    and number of model runs


The events of a battle may be computed and expressed as expected
values or they may be computed less efficiently, but more realistically,
by the Monte Carlo technique. The two procedures may not produce the
same results. Each method may provide information not available from
the other. Our main concern with the ADAGE simulation was that its use
of "kill probabilities" based on the interaction of a single weapon and
a single aircraft neglects the complexities of multiple aircraft attacks
and could lead to substantial distortions of what happens in the real
world.
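
The difference between the two procedures can be illustrated with a
minimal Python sketch. It is not drawn from the ADAGE, Carmonette, or
COMO III; the function names, raid sizes, kill probability, and shot
limit are all hypothetical, chosen only to show how a one-on-one kill
probability applied to an average raid can diverge from a Monte Carlo
estimate once a defense can be saturated.

```python
import random


def expected_value_kills(mean_raid_size, p_kill):
    """Expected-value shortcut: average raid size times one-on-one kill probability."""
    return mean_raid_size * p_kill


def monte_carlo_kills(raid_sizes, p_kill, max_shots, runs, seed=0):
    """Average kills per raid when the defender can fire at most max_shots times."""
    rng = random.Random(seed)
    total_kills = 0
    for _ in range(runs):
        raid = rng.choice(raid_sizes)     # raid size varies from run to run
        engaged = min(raid, max_shots)    # saturation: extra aircraft are not engaged
        total_kills += sum(1 for _ in range(engaged) if rng.random() < p_kill)
    return total_kills / runs


# Hypothetical inputs, chosen only to make the effect visible.
RAID_SIZES = list(range(1, 9))   # raids of 1 to 8 aircraft, equally likely
P_KILL = 0.5                     # one-on-one kill probability
MAX_SHOTS = 4                    # shots available before the raid has passed

ev_estimate = expected_value_kills(sum(RAID_SIZES) / len(RAID_SIZES), P_KILL)
mc_estimate = monte_carlo_kills(RAID_SIZES, P_KILL, MAX_SHOTS, runs=20000)

print(f"expected-value estimate: {ev_estimate:.3f} kills per raid")
print(f"Monte Carlo estimate:    {mc_estimate:.3f} kills per raid")
```

Under these assumed inputs the expected-value figure is 4.5 x 0.5 = 2.25
kills per raid, whereas the average number of aircraft actually engaged
is 3.25, so the Monte Carlo estimate should settle near 1.625. The gap
comes entirely from the shot limit, which the one-on-one shortcut
ignores.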


The Selection of Input Data


The  results  of  a  simulation  are  dictated  in  large  part  by  the  data  that  an 
analyst  enters  into  the  computer:  missile  firing  rates,  target  damage 
probabilities,  information  about  the  terrain,  and  so  on.  If  the  input  data 
are  basically  inappropriate  or  problems  arise  from  tailoring  the  data 
before  they  are  used  in  the  model,  the  credibility  of  the  results  is  likely 
to  be  diminished.  In  our  assessment,  we  attempted  to  determine  the  data 
shortcomings  in  the  case  study  simulations. 


The Carmonette and COMO III appeared to have relatively appropriate
data. In the earlier analyses, the ADAGE and Carmonette modelers
differed in the selection of input data and models for the visual
detection of approaching aircraft. In the later compromise, the data did
not properly describe the DIVAD's detection capabilities. The ADAGE
simulation had the most serious input data limitations, because some of
its data were outdated and some key values (such as air damage to ground
targets) produced results too large to be accepted by knowledgeable
military officials. Table 4.9 shows that all three models had some
limitations with regard to this factor.


Table 4.9: The Selection of Input Data

Weapon: DIVAD. Model: ADAGE.
  Data source
    Strength: Uses data from a variety of recognized sources
  Data quality
    Strength: Visual detection data cover full range of gun for
      helicopters
    Limitation: Visual detection and terrain data are old; night vision
      and electro-optical laboratory data are inadequate for the DIVAD's
      ability to detect aircraft to the full range of the gun
  Data tailoring
    Limitation: Description of weapons in Incursion submodel is an
      integral part of the model and not addressed through a data base

Weapon: DIVAD. Model: Carmonette.
  Data source
    Strength: Uses data from a variety of recognized sources, some
      different from the ADAGE's sources
  Data quality
    Strength: Uses a visual detection submodel from the ADAGE for
      fixed-wing aircraft; early problems using Soviet ZSU-23 to model
      the DIVAD were overcome
    Limitation: Uses night vision and electro-optical laboratory data
      inaccurate for the DIVAD's visual detection of helicopters
  Data tailoring
    Limitation: Data tailored extensively to meet model requirements
      could affect results

Weapon: Stinger. Model: COMO III.
  Data source
    Strength: Uses data from a variety of recognized sources, some
      different from the ADAGE and Carmonette sources
  Data quality
    Strength: Engineering data are reasonably reliable
    Limitation: Human-factors data are not as reliable as engineering
      data
  Data tailoring
    Strength: Straightforward for engineering data
    Limitation: Data about the Stinger team's reactions may have been
      subject to greater adjustment or interpretation than engineering
      parameters


Some of the Carmonette's early data problems, such as an incorrect
description of the DIVAD gun, were corrected, but the problems with
disputed visual-detection data remained, and disputes concerning these
data required the ADAGE modelers to change their detection data. The
Carmonette and COMO III simulations require extensive tailoring of data
in order to make the data usable in the model, opening the possibility
that the results may depend as much on the judgment of the staff as on
the operations the model simulated.
Summary


All three simulations had considerable capability with regard to
portraying weapons engaging targets and simulating important aspects of
measures of effectiveness. In almost all instances, however, the
simulations we studied had some limitations. We believe that the effort
required to remove some of the limitations we found might be relatively
minor, but for others, much more work would be required. In a few
instances, fixing the model might not be the appropriate response; using
a different model might be more appropriate. For example, our assessment
indicates that the Carmonette, as a combined-arms battalion-level model,
was generally not as well suited to answering the original questions
posed about the DIVAD as an air defense alternative, so that modifying
the model is probably not a reasonable solution to the limitation.
In this chapter, we focus on factors 8-11 in our framework, or the
procedures with which the analysts demonstrate that a model is a good
representation of reality and that the results are acceptable surrogates
for results that might be collected in the operation of a weapon system.
In table 5.1, the factors are repeated from table 2.1.


Table 5.1: Correspondence to the Real World (a)

Area of concern: The correspondence between the model and the real world
  Factor 8: Evidence of a verification effort
  Factor 9: Evidence that the results are statistically representative
  Factor 10: Evidence of sensitivity testing
  Factor 11: Evidence of validation of results

(a) The two remaining areas of concern and 10 other factors are in
table 2.1.


While analysts can never provide absolute guarantees about the
credibility of a model or its accuracy, they should be able to provide
information so that the required decisions can be made with some degree
of confidence. They can produce evidence that (1) the computer program
operates as the simulation model's designers intended, (2) the output of
the simulation represents the model's average output over many runs,
(3) the results take into account sensitive parameters and alternative
scenarios, and (4) a model's results bear sufficient resemblance to
real-world results or results from other models or methods. In reviewing
the simulations, we paid some attention to the use of COMO III with
weapon systems other than the Stinger, because the information
contributed to the credibility of the COMO modeling system. (A more
detailed discussion of our findings is in appendix III.)


Verification 


The process of verification, or determining that the computer
programmer has translated a model into correct computer code, may be
performed as part of the programming and checkout phases of a
simulation's development. These phases are often not documented; that
is, they may be performed, but the history of the performance is usually
not recorded. Consequently, it is often difficult to find written
evidence of verification.


In our case studies, no documentary evidence of verification was
available for either the ADAGE or the Carmonette. DOD personnel involved
with the ADAGE informed us that some checks of the computer code had
been made and problems had been found and corrected. The Carmonette
analysts reported that some peer review had been performed. We were
unable to document any verification of the COMO III Stinger model or the
variant that was developed for the Stinger's battery coolant unit
analysis. (See table 5.2.)


Table 5.2: Evidence of Verification

Weapon: DIVAD. Model: ADAGE.
  Strength: Analysts commented that computer code was checked and errors
    were corrected
  Limitation: No formal efforts documented

Weapon: DIVAD. Model: Carmonette.
  Strength: Extensive peer review was reported
  Limitation: No formal efforts documented

Weapon: Stinger. Model: COMO III.
  Strength: U.S. Army Missile Command staff verify and validate
    contractors' simulations as a standard procedure
  Limitation: No specific verification efforts identified


The lack of documented evidence of verification presents a clear threat
to the credibility of the three simulations. The recollections of some
analysts have some value, but written documentation would be
preferable.
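
One common form of written verification evidence is a recorded check of
the code against a case with a known answer. The Python sketch below is
illustrative only and does not describe any check performed on the three
simulations; the salvo routine, the parameter values, and the tolerance
are assumptions. It compares a small Monte Carlo routine with the
closed-form probability that at least one of several independent shots
kills the target, and returns a record that could be filed with the
model's documentation.

```python
import random


def analytic_salvo_kill(p_single, shots):
    """Closed form: probability that at least one of `shots` independent shots kills."""
    return 1.0 - (1.0 - p_single) ** shots


def simulated_salvo_kill(p_single, shots, runs, seed=0):
    """Monte Carlo estimate of the same probability (the code under verification)."""
    rng = random.Random(seed)
    kills = 0
    for _ in range(runs):
        if any(rng.random() < p_single for _ in range(shots)):
            kills += 1
    return kills / runs


def verification_record(p_single, shots, runs=50000, tolerance=0.02):
    """Compare simulation with theory and return a record suitable for filing."""
    expected = analytic_salvo_kill(p_single, shots)
    observed = simulated_salvo_kill(p_single, shots, runs)
    return {
        "case": f"p_single={p_single}, shots={shots}, runs={runs}",
        "expected": expected,
        "observed": observed,
        "tolerance": tolerance,
        "passed": abs(expected - observed) <= tolerance,
    }


record = verification_record(p_single=0.3, shots=3)   # hypothetical test case
print(record)
```

A check of this kind covers only the cases for which a closed-form
answer exists; its value as evidence lies in the fact that the case, the
expected result, and the outcome are written down.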


Statistical Representation


Credibility rises as a model's users become assured that its
statistically averaged results do not vary widely when the model is
exercised several times. It is important to know whether the results of
one or a few runs reasonably represent the values that would be
developed if a simulation were operated an indefinite number of times.

The Incursion submodel of the ADAGE, using the Monte Carlo modeling
technique, uses multiple runs to determine one-on-one kill probabilities
that are then used in the Campaign submodel. Analysts who worked with
ADAGE informed us that each Incursion scenario had been run 500 times
and that the resultant mean was within 1 or 2 percent of the true mean
at the 98-percent confidence level. This is substantial support for the
simulation's credibility.

Each run of the Carmonette, however, required a substantially larger
commitment of computer resources. Therefore, the analysts used a limited
number of replications, generally 10 for a scenario. Replications of the
scenarios brought many of the aggregated results to within 10 percent of
the true mean at the 85-percent confidence level. Similar levels of
confidence were not achieved for individual weapon systems, so that
questions remain as to whether the Carmonette's battalion-level results
can be extrapolated to the division. The Carmonette's analysts doubted
that they would be able to improve the confidence with a reasonable
number of additional replications.

The COMO III simulation of the Stinger made only one run for each
scenario, and the report of the analysis did not address the statistical
representativeness of the results. Thus, we do not know whether the
differing results from scenario to scenario came from differences in the
scenarios or random variation inherent in the model. The extent to which
statistical representativeness supports or threatens credibility is
quite mixed across the simulations, as can be seen in table 5.3.


Table 5.3: Evidence of Statistical Representation

Weapon: DIVAD. Model: ADAGE.
  Strength: Probability of kill developed with multiple replications;
    statistical procedures developed kill probabilities within 2 percent
    of true mean at 98-percent confidence level

Weapon: DIVAD. Model: Carmonette.
  Strength: Multiple runs on many scenarios provided confidence in
    results; many results were within 10 percent of true mean at
    85-percent confidence level
  Limitation: Either model variability or insufficient replications
    prevented development of confidence levels for some results

Weapon: Stinger. Model: COMO III.
  Limitation: No evidence of testing for measures of the mean and
    variance of results prior to experimenting with alternative
    scenarios; simulation appeared to move directly to scenario and some
    parameter testing; information on confidence in results was not
    developed because there was only one run per scenario


The large number of replications and the quality of results in the
ADAGE simulation enhance its credibility. For the Carmonette, the
analysts addressed statistical representativeness but with only limited
success. Thus it has some credibility but not that of the ADAGE
simulation. The COMO III simulation appears not to have addressed the
need for developing statistically representative values. This
constitutes a threat to the credibility of the simulation.
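
The relation between the number of replications and a statement such as
"within 2 percent of the true mean at the 98-percent confidence level"
can be sketched in Python as follows. This is an illustration under
stated assumptions, not the procedure the analysts used: the interval
relies on a normal approximation (for as few as 10 runs a Student t
quantile would give a somewhat wider interval), and `toy_run` with its
mean and spread is an invented stand-in for one simulation run.

```python
import random
from math import sqrt
from statistics import NormalDist, mean, stdev


def relative_half_width(results, confidence):
    """Half-width of the confidence interval for the mean, as a fraction of the mean.

    Uses a normal approximation; returns None when there are fewer than two runs.
    """
    n = len(results)
    if n < 2:
        return None  # a single run says nothing about run-to-run variation
    z = NormalDist().inv_cdf(0.5 + confidence / 2.0)
    return z * stdev(results) / sqrt(n) / abs(mean(results))


def toy_run(rng):
    """Hypothetical stand-in for the output of one simulation run."""
    return rng.gauss(100.0, 20.0)


rng = random.Random(1)
runs_500 = [toy_run(rng) for _ in range(500)]
runs_10 = runs_500[:10]
runs_1 = runs_500[:1]

hw_500 = relative_half_width(runs_500, 0.98)   # many replications
hw_10 = relative_half_width(runs_10, 0.85)     # few replications
hw_1 = relative_half_width(runs_1, 0.98)       # one run per scenario

print("500 runs:", hw_500)
print(" 10 runs:", hw_10)
print("  1 run: ", hw_1)
```

With one run per scenario the function can return nothing at all, which
is the situation described for the COMO III Stinger analysis: without
replications there is no basis for separating scenario effects from
random variation.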


Sensitivity  Testing 


It is important to know how sensitive a simulation's results are to
errors or fluctuations in the values of its input parameters. Some
parameters, such as the detection range of a missile system, may be in
considerable doubt; others, such as visibility, may simply be subject to
wide variation. If a model is especially sensitive to a parameter, then
the credibility of the results will be lessened if the estimate of the
parameter is in error. Sensitivity testing helps determine whether there
may be a problem.
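
A simple form of sensitivity testing varies one input at a time around
a baseline and records the relative change in the output. The Python
sketch below shows the idea; the effectiveness function, its parameter
names, and the baseline values are hypothetical and do not come from any
of the three models.

```python
def one_at_a_time_sensitivity(model, baseline, fraction=0.10):
    """Relative change in output when each input moves from -fraction to +fraction."""
    base_output = model(**baseline)
    report = {}
    for name, value in baseline.items():
        low = dict(baseline, **{name: value * (1.0 - fraction)})
        high = dict(baseline, **{name: value * (1.0 + fraction)})
        report[name] = (model(**high) - model(**low)) / base_output
    return report


def toy_effectiveness(detection_range_km, visibility_km, reload_time_s):
    """Hypothetical measure: area the weapon can cover per unit of reload time."""
    usable_range = min(detection_range_km, visibility_km)  # cannot engage what is unseen
    return usable_range ** 2 / reload_time_s


# Hypothetical baseline in which visibility, not detection range, is the binding limit.
BASELINE = {"detection_range_km": 6.0, "visibility_km": 4.0, "reload_time_s": 10.0}

sensitivity = one_at_a_time_sensitivity(toy_effectiveness, BASELINE)
for name, change in sensitivity.items():
    print(f"{name}: {change:+.3f}")
```

In this invented case the output does not respond at all to the
detection range, because visibility is the limiting factor, but responds
strongly to visibility itself. For a Monte Carlo model, each evaluation
would need to be an average over enough replications that the change
being measured is not random variation.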

A related issue is that the effectiveness of a weapon system may vary
substantially as the combat scenario changes. For example, a
surface-to-air missile system may be effective against attack aircraft
but easily defeated if jamming is used. A scenario can be tested by
running a simulation model under a wide variety of realistic battle
conditions in order to obtain a broad view of a weapon's effectiveness.
This may be viewed as testing the sensitivity of the simulation results
to variations in scenarios. Table 5.4 summarizes the extent and manner
in which the ADAGE, Carmonette, and COMO III were tested for sensitivity
in our case studies.


Table 5.4: Evidence of Testing for Sensitivity to Parameters and
Alternative Scenarios

Weapon: DIVAD. Model: ADAGE.
  Parameters
    Strength: In detailed analysis of four major parameters, three were
      found to have a major effect on the weapon's effectiveness
  Scenarios
    Strength: Scenarios investigated weapons, environment, and
      alternative threats

Weapon: DIVAD. Model: Carmonette.
  Parameters
    Strength: Investigated in scenario tests; some scenario changes were
      slight enough to be equivalent to parameter changes
  Scenarios
    Strength: Investigating many scenarios gave insights on
      relationships between visibility and the weapon's effectiveness

Weapon: Stinger. Model: COMO III.
  Parameters
    Strength: Visibility parameter tested
    Limitation: Additional runs needed
  Scenarios
    Strength: Range of scenarios tested
    Limitation: Only one run per scenario

According to the ADAGE documentation, including the comparative analysis
and cost and operational-effectiveness reports, the ADAGE modelers
tested four parameters they believed could cause substantial error in
conclusions about the DIVAD's effectiveness if the parameters were in
error. They experimented with scenarios for variations in threat levels,
environment, and the use of other air defense weapons, thus developing
valuable information on the simulation's response.

Extensive experimentation with scenarios was also performed with the
Carmonette. More than 50 different scenarios were examined in the
simulations presented in the 1984 and 1985 reports. Many involved a major
change, such as the addition or deletion of a type of weapon system, but
some were relatively minor and might be better thought of as sensitivity
analyses of specific parameters. There was no formal, separate parameter
testing for the Carmonette, although there is evidence that such testing
was performed on earlier versions of the model that did not include
the DIVAD component. Tests of alternative scenarios provided important
insights on the effectiveness of both the DIVAD and total battalion
defense with regard to visibility, mode of operation, and current versus
mature DIVAD capabilities.

The  report  documenting  the  COMO  III  simulation  analysis  indicated  that 
sensitivity  testing  was  performed  for  visibility.  The  analysis  addressed 
11 scenarios that considered a broad range of air defense, threat, and
visibility  conditions. 

Sensitivity testing can contribute directly to an understanding of a
model's behavior and to its credibility, and it did so for all three
models we examined. The ADAGE analysts used both parameter testing and
experimentation with alternative scenarios to examine simulation results.
The credibility of both the Carmonette and the COMO III also benefited
from the use of parameter tests and alternative scenarios.
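
The scenario experimentation described in this section can be pictured as a sweep over combinations of battle conditions. The Python sketch below is an illustration only: the `effectiveness` function, the threat types, and every number in it are hypothetical assumptions, not values from the ADAGE, Carmonette, or COMO III.

```python
from itertools import product


def effectiveness(threat, visibility_km, jamming):
    """Hypothetical toy measure; all numbers are illustrative assumptions."""
    base = {"attack_aircraft": 0.6, "helicopter": 0.4}[threat]
    visibility_factor = min(visibility_km / 10.0, 1.0)  # saturates at 10 km
    jamming_factor = 0.25 if jamming else 1.0           # jamming degrades the system
    return base * visibility_factor * jamming_factor


threats = ["attack_aircraft", "helicopter"]
visibilities_km = [2.0, 5.0, 10.0]
jamming_options = [False, True]

# Run the toy model once for every combination of conditions.
results = {
    scenario: effectiveness(*scenario)
    for scenario in product(threats, visibilities_km, jamming_options)
}

best = max(results, key=results.get)
worst = min(results, key=results.get)
print(f"{len(results)} scenarios; best {best}, worst {worst}")
```

A sweep of this kind shows how far a weapon's apparent effectiveness depends on the conditions chosen. With a stochastic model, each scenario would also need several replications before the differences between scenarios could be trusted.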


Validation  of  Results 


Validation,  in  a  narrow  sense,  is  the  comparison  of  simulation  results  to 
results  from  other  methods,  such  as  operational  testing  and  evaluation 
or  historical  experience,  or  from  models  for  estimating  a  weapon’s  per¬ 
formance  that  are  believed  to  be  substantially  credible.  The  limited  evi¬ 
dence  from  our  case  studies  suggests  that  validation  is  not  planned  for 
or  conducted  routinely  but  is  more  likely  to  be  performed  when  a  dispar¬ 
ity  is  found  in  the  results  of  similar  models  or  between  the  model  and 
real system data. Analysts or others in DOD may then request a resolution
or an explanation. Our conclusions about validation efforts for the
simulations we studied are summarized in table 5.5.

Table 5.5: Evidence of Validation

Weapon   Model       Test          Strength                                     Limitation

DIVAD    ADAGE       Other models  Two major comparisons attempted with the     No validation prior to Carmonette
                                   Carmonette; early effort was thought to      comparison
                                   give good correspondence, but comparison
                                   after changes was unsuccessful

                     Operations                                                 No operational tests identified

         Carmonette  Other models  Same as ADAGE                                No validation prior to ADAGE
                                                                                comparison

                     Operations    The model was validated, but not with the
                                   DIVAD, against a tank warfare field
                                   experiment

Stinger  COMO III    Other models  The model, but not with Stinger, was
                                   compared with an Air Force model, with a
                                   satisfactory resolution of initial
                                   differences

                     Operations                                                 No operational tests identified

We found that no formal validation efforts using real-world DIVAD data
were performed on the ADAGE or Carmonette. This is not to suggest,
however, that there was no attempt at validation. The Army regarded the
use of the Carmonette to model the DIVAD as itself a validation effort for
the ADAGE. It was made when questions arose about the DIVAD's
effectiveness as shown by the ADAGE, whose results differed substantially
from those of the Carmonette and other air defense models. However,
further analyses that adjusted the models for consistency in inputs
(for example, the same number of air-to-ground munitions) and scenarios
(for example, the same size battle) made the ADAGE results reasonably
comparable to those of the other models. Later changes in the
Carmonette model, however, led to differences in the adjusted results
with a cause that could not be pinpointed.

We did not find evidence of validation specifically for the Stinger
simulation. We did, however, find evidence of an effort to validate the
COMO III model by comparing its results to those from an Air Force model
called SORTIE. The reasonable agreement of results when simulating
similar conditions suggests that model-to-model validation can marginally
strengthen credibility, especially when comparisons with real-world
data are lacking.

Efforts to validate the ADAGE and Carmonette with respect to the DIVAD
were limited to comparing the two models to each other and, to a limited
extent, to other models. The lack of validation success with the
model-to-model comparison threatens the credibility of the models. With no
direct validation, the COMO III situation was similarly weak. Yet the
comparison with the SORTIE suggests that validation should be attempted
and that even comparison between dissimilar models may improve a
model's credibility.
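
A model-to-model comparison of the kind discussed above can be sketched as follows. This is a minimal illustration under stated assumptions: the function `compare_models`, the 15 percent tolerance, the scenario names, and the numbers are all hypothetical and are not data from the ADAGE, Carmonette, COMO III, or SORTIE. The sketch also assumes that inputs and scenarios have already been made consistent between the two models.

```python
def compare_models(results_a, results_b, tolerance=0.15):
    """Return scenarios where two models disagree by more than `tolerance`.

    results_a and results_b map a scenario name to an effectiveness measure.
    Only scenarios present in both sets of results are compared.
    """
    disparities = {}
    for scenario in sorted(results_a.keys() & results_b.keys()):
        a, b = results_a[scenario], results_b[scenario]
        scale = max(abs(a), abs(b))
        rel_diff = 0.0 if scale == 0 else abs(a - b) / scale
        if rel_diff > tolerance:
            disparities[scenario] = rel_diff
    return disparities


# Illustrative numbers only; not data from the report.
model_a = {"clear_day": 12.0, "low_visibility": 7.0, "jamming": 3.0}
model_b = {"clear_day": 11.0, "low_visibility": 4.0, "jamming": 3.2}

for scenario, rel_diff in compare_models(model_a, model_b).items():
    print(f"{scenario}: models differ by {rel_diff:.0%}")
```

Agreement within the tolerance does not show that either model is right. It only shows that the two are consistent with each other, which is why this kind of comparison strengthens credibility only marginally when real-world data are lacking.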


Summary 


Some  of  the  efforts  of  the  simulation  analysts  to  show  that  the  models 
we  examined  closely  represent  reality  were  very  limited.  Some  valida¬ 
tion  was  not  even  attempted.  In  general,  the  efforts  to  validate  simula¬ 
tion  results  by  direct  comparison  to  data  on  weapon  effectiveness 
derived  by  other  means  were  weak,  and  it  would  require  substantial 
work  to  increase  their  credibility.  Credibility  would  also  have  been 
helped  by  better  documentation  of  the  verification  of  the  computer  pro¬ 
gram  and  by  establishing  that  the  simulation  results  were  statistically 
representative. Probably the strongest contribution to credibility came
from efforts to test the parameters of models and to run the models with
alternative scenarios.


Chapter 6

Credibility Based on Support Structures, Documentation, and Reporting


Many simulation models have a long lifetime. They are created and
modified, become more complicated, and are sometimes used in several
versions. Because of this, simulation models, like all other complex
software, must be supported by an organization that documents their
operation and ensures that decisionmakers understand both the strengths
and limitations of the model. We believe that this will not create
credibility where the underlying theory, computer representation, or
validation procedures are weak, but it will help prospective users judge the
applicability of a simulation to their needs and will add further
credibility if the simulation is relatively strong. Table 6.1 shows the
factors from our complete framework that we address in this chapter.


Table 6.1: The Three Factors for Support Structures, Documentation, and Reporting*

Area of concern                Factor

The support structures,        12  Establishment of support structures to manage
documentation, and reporting       the simulation's design, data, and operating
                                   requirements

                               13  Development of documentation to support the
                                   information needs of persons using the simulation
                                   or its results

                               14  Disclosure of the simulation's strengths and
                                   weaknesses when the results are reported

*The two remaining areas of concern and 11 other factors are in table 2.1.


Support Structures for Design, Data, and Operations


In reviewing Army actions relating to the ADAGE, Carmonette, and COMO III,
we looked for evidence that support structures had been established for
controlling the three models and evidence that any resultant organizations
were functioning as intended. We found that each model had been
assigned to a formal entity for management: the ADAGE to the U.S. Army
Materiel Systems Analysis Activity, the Carmonette to the U.S. Army
Training and Doctrine Command (TRADOC) Systems Analysis Activity,
and the COMO III to the U.S. Army Missile Command. In addition, the
Army designated the deputy chief of staff for doctrine as responsible for
ensuring that doctrine, future concepts, and threats are properly
portrayed in the models.


Illustrating one type of support, TRADOC plays a role in both managing
and using simulation models. Its regulation entitled “Management:
TRADOC Models” (regulation 5-4, August 20, 1982) provides guidance on
managing the models under its control. TRADOC designates one agency as
responsible for each model, both for the development of software and for
the management of the data base and changes in a model's configuration.
Although others may use the model and may even make changes
for their own needs, the alterations are controlled in that the nature of
the model must not be changed, the changes must be coordinated with
the responsible agency, and the changed model must not be shared with
a third agency.

Several other groups play roles in controlling the models. For example,
an interagency group was established in 1980 to exert some control over
the COMO III's configuration and documentation and the development of
new models. In 1986, a COMO model resources group was formally
convened, again with the aim of providing some control over the model.

In an effort to maintain oversight and review at a different level, TRADOC
establishes study advisory groups to monitor the progress of individual
studies using models under TRADOC's control. For example, in two DIVAD
studies, a 1984 cost and operational-effectiveness update and a 1985
comparative analysis, study advisory groups played active roles regarding
the use of the ADAGE and Carmonette.

Another  kind  of  control  is  exerted  by  weapon-system  program  offices, 
which  sometimes  establish  working  groups  to  oversee  engineering  simu¬ 
lations.  For  example,  the  Stinger  program  office  appointed  working 
groups  to  define  the  validation  requirements  for  models  and  to  review 
and  approve  validation  data. 

We looked beyond the mere establishment of a support structure to see
if the organizations we identified were actively managing the simulation
models and the associated studies of weapon systems. Some organizations
have had a long-term relationship with a particular simulation, as
the COMO model management board has had with COMO III. Others
have had a brief but intense relationship, such as the study advisory
groups that have the authority to advise on the use of a specific
simulation model, the input data, or the scenarios in an analysis. We believe
the long-term relationship is more likely to lead to a substantive effect
on the credibility of simulation results. Our review of the support
structures is summarized in table 6.2.

Table 6.2: Support Structures for Design, Data, and Operations

Weapon   Model       Strength                                      Limitation

DIVAD    ADAGE       U.S. Army Materiel Systems Analysis Activity  U.S. Army Air Defense Artillery School has
                     is responsible for management; study          been considered the appropriate manager for
                     advisory groups oversee and review specific   air defense functional models such as the
                     studies                                       ADAGE; a study advisory group is organized
                                                                   for a specific study and does not focus on
                                                                   long-term configuration control of models

         Carmonette  U.S. Army Training and Doctrine Command is    A study advisory group is organized for a
                     responsible for management; study advisory    specific study and does not focus on
                     groups oversee and review specific studies    long-term configuration control of models

Stinger  COMO III    U.S. Army Missile Command is responsible      U.S. Army Air Defense Artillery School has
                     for management; COMO model management board   been considered the appropriate manager for
                     represents users from various agencies and    functional models such as the COMO
                     meets periodically to guide development,
                     configuration, and documentation; a COMO
                     model resources group was also established
                     to facilitate greater coordination among
                     users


The  Army  seems  to  have  been  at  least  partially  successful  in  maintain¬ 
ing  simulation  models  and  controlling  their  development  and  use.  It 
assigned  formal  responsibilities  for  control  for  each  of  the  case  study 
models  and  involved  several  groups  within  the  Army  that  have  an  inter¬ 
est  in  the  development  of  specific  models.  The  present  structure  for 
managing  COMO  III  recognizes  the  different  interests  of  those  various 
groups  and  their  viewpoints  toward  simulation. 


Documentation for Users


Well-documented  simulation  models  inspire  confidence  that  the  models 
will  be  used  correctly  to  address  the  types  of  issues  for  which  they  were 
designed.  Conversely,  if  documentation  is  incomplete,  and  especially  if  a 
model  has  been  evolving  for  a  long  time,  we  are  concerned  that  a  model 
may  not  be  simulating  the  events  and  conditions  the  analysts  think  it  is. 
We  looked  for  evidence  of  clear  and  complete  documentation.  What  we 
found is summarized in table 6.3.


Table 6.3: Documentation for Users

Weapon   Model       Attribute     Strength                                Limitation

DIVAD    ADAGE       Completeness  Original documentation is complete      Recent changes are not yet documented

                     Adequacy      No major problems reported; the
                                   developer and user communicate
                                   frequently

         Carmonette  Completeness  An executive summary and list of input  Detailed documentation is not available
                                   variables are available

                     Adequacy                                              Lack of documentation was reported as a
                                                                           problem in understanding the results

Stinger  COMO III    Completeness  Comprehensive and detailed
                                   programmer-user manual is available;
                                   comparably complete documentation is
                                   available for other models and for the
                                   overall system

                     Adequacy      No problems reported or identified      Basic knowledge of COMO is required to
                                                                           use the manual

We found the ADAGE relatively well documented, at least through
September 1978. However, the cost and operational-effectiveness update
study for the DIVAD required substantial changes to the ADAGE that were
not accounted for in the documentation.

The  Carmonette  is  documented  relatively  poorly,  which  became  evident 
during  the  cost  and  operational-effectiveness  update  study,  when  ana¬ 
lysts  at  the  U.S.  Army  Air  Defense  Artillery  School  tried  to  reconcile 
disparities in the results produced by the ADAGE and Carmonette. The
analysts  expressed  doubt  about  being  able  to  reach  a  reasonable  under¬ 
standing  of  the  Carmonette  without  better  documentation.  The  chair¬ 
man  of  the  study  advisory  group  charged  with  overseeing  the  update 
also  expressed  concern  about  the  lack  of  documentation. 

The COMO series of models has extensive documentation. Documentation
was produced in the late 1960's and early 1970's at the technical center
of the Supreme Headquarters of the Allied Powers in Europe, where the
COMO was developed. Since then, much of the documentation has been
produced by or for the Army Missile Command as part of the process of
developing and validating individual weapon-system models and
improving the COMO's program structure.

We found the main documentation for the COMO III simulation of the
Stinger comprehensive and detailed. Although validation documents
were not available for the Stinger, they had been produced for
corresponding COMO simulations of the Patriot and Hawk missiles.

In sum, the COMO III, and to a lesser extent the ADAGE, has documentation
that  tends  to  strengthen  the  user's  confidence  in  the  credibility  of  the 
simulation.  The  considerable  lack  of  documentation  for  the  Carmonette 
detracts  from  the  confidence  that  a  user  might  have  in  its  credibility. 


Reports of Strength and Weakness


In  examining  reports  from  the  simulation  studies,  we  wanted  to  deter¬ 
mine  the  extent  to  which  the  simulations'  strengths  and  weaknesses 
were  discussed.  We  believe  that  the  candid  and  complete  discussion  of  a 
model  is  associated  with  a  positive  contribution  to  credibility. 

The reports we examined included the following. For the ADAGE, we
reviewed the report on the DIVAD's 1977 cost and operational-effectiveness
analysis and the draft reports for its 1984 update and the 1985
comparative analysis. For the Carmonette, we reviewed the 1984 update
on the cost and operational-effectiveness analysis and the 1985
comparative analysis. For the COMO III, we reviewed the Stinger
battery-coolant-unit usage report, a validation report for the Patriot
missile studies, and the documentation for the Stinger model. Our
observations are summarized in table 6.4.


Table 6.4: Disclosure of Results

Weapon   Model       Strength                                      Limitation

DIVAD    ADAGE       Explicitly stated objectives, strengths, and  The 1984 draft update report and the 1985
                     weaknesses of the simulation analyses; the    draft comparative analysis report contained
                     1977 cost and operational-effectiveness       less description of underlying assumptions;
                     analysis report was especially comprehensive  the later report included fewer
                                                                   division-level analyses

         Carmonette  Included major modeling limitations           Contained cursory description of theoretical
                                                                   bases for analyses; did not address how
                                                                   limitations affected results; variability of
                                                                   results was not addressed; some
                                                                   recommendations not supported by analyses

Stinger  COMO III    Included details about the model and its      Omitted description of some methodological
                     limitations; report on validation of Patriot  and modeling weaknesses
                     models gives highly detailed reporting of
                     strengths and limitations


The ADAGE reports contained explicit statements of the study's objectives
and the strengths and limitations of the simulation. The 1977
report provided the rationale for studying air defense in a division
context and identified the major measures of effectiveness. It explained the
logic of the simulation, the relationship between the Incursion and
Campaign submodels, and the manner in which air-to-air and ground battle
results are integrated. Although the implications of the analysis of
ground battle damage were not fully discussed, it was, on the whole, an
adequate treatment of the simulation's strengths and limitations.

The 1984 update, which was issued only in a draft version, also clearly
specified the purpose of the simulation. It did not cover the background
information as intensively as the 1977 report, but it did address changes
to the ADAGE model after 1977, and it contained a section reconciling the
ADAGE results with the results produced by the Carmonette and other
TRADOC models. The analysis of alternative air defense structures stated
the assumptions and limitations clearly. Thus, except for not repeating
the underlying assumptions, this report also contributed to the
credibility of the simulation.

The  results  from  the  1985  comparative  analysis  (also  issued  in  draft 
only)  tended  to  concentrate  on  outcomes  pertaining  to  the  protection  of 
forward  combat  units  and  gave  less  attention  to  the  division  context.  A 
more  balanced  presentation  would  have  been  more  appropriate.  Several 
limitations  of  the  simulation  were  discussed  and  an  attempt  was  made  to 
identify and reconcile inconsistencies in the results of the ADAGE and
Carmonette. 

The  Carmonette’s  1984  update  report  appeared  to  make  recommenda¬ 
tions  that  were  not  well  supported  by  the  simulation's  results,  and  little 
or  no  attention  was  given  to  the  theoretical  basis  of  the  analyses.  While 
some  of  the  model's  limitations  were  discussed,  the  authors  did  not 
address  how  they  might  have  affected  the  results.  There  was  substantial 
variance  in  the  results  of  the  runs,  yet  they  were  accepted  without  dis¬ 
cussion  of  the  effects  of  their  variance  or  instability.  The  1985  compara¬ 
tive  analysis  clearly  stated  the  purpose  of  analysis  and  some  of  the 
major  assumptions  and  limitations  of  the  model.  But  many  of  the  impor¬ 
tant  areas  not  discussed  in  the  1984  update  were  still  not  completely 
addressed,  and  the  analysis  was  again  based  on  a  small  number  of  repli¬ 
cations  and  unstable  results.  A  summary  statement  about  the  report 
identified  several  major  limitations  of  the  simulation  that,  in  our  opin¬ 
ion,  cast  substantial  doubt  on  the  ability  of  the  Carmonette  to  study  air 
defense  alternatives,  although  the  statement  itself  did  not  draw  such  a 
broad  conclusion. 

The Stinger battery-coolant-unit usage study clearly developed the
rationale for the scenarios and identified the limitations of both the
computer and the model. The COMO III model was described at a level
of detail that would allow an analyst to examine the operation of the
Stinger submodel closely. However, one limitation of the
report was the implicit assumption that the submodel for another air
defense weapon being simulated within the COMO III was sufficiently
credible and accurate that the overall results would not be biased. Given
the size and complexity of the COMO modeling system, however, it may
not be reasonable to expect that an analysis of a particular
weapon-system model can also address the credibility of other COMO submodels
in detail. A second limitation was the lack of comment regarding the fact
that only one replication for each scenario was produced and, thus, the
unresolved issue of statistical representativeness in the results.
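
The concern about a single replication can be illustrated with a small sketch. It assumes a hypothetical stochastic run, `one_replication`, in which each engagement is an independent random draw. The function names, the kill probability, and the number of engagements are assumptions made for the example and do not come from the COMO III.

```python
import random
import statistics


def one_replication(rng, engagements=20, p_kill=0.3):
    """Hypothetical stochastic run: count kills in a fixed number of engagements."""
    return sum(rng.random() < p_kill for _ in range(engagements))


def summarize(outcomes):
    """Return (mean, standard error); the standard error needs at least two runs."""
    mean = statistics.mean(outcomes)
    if len(outcomes) < 2:
        return mean, None  # one replication gives no estimate of variability
    std_err = statistics.stdev(outcomes) / len(outcomes) ** 0.5
    return mean, std_err


rng = random.Random(1988)  # fixed seed so the sketch is repeatable
single = [one_replication(rng)]
many = [one_replication(rng) for _ in range(30)]

print("1 run  :", summarize(single))
print("30 runs:", summarize(many))
```

With one run there is a result but no way to say how far it might sit from the typical outcome. With repeated runs the spread can be estimated and reported alongside the mean.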

The COMO III modeling system functions with submodels that represent
specific types of weapon systems. The reporting on the strengths and
limitations of some of these submodels was complete and useful. For
example, the report on the validation of the high-resolution Patriot
missile submodel with three other surface-to-air submodels within COMO III
was  a  thorough  comparative  analysis  in  which  the  results  of  each  model 
were  developed  and  compared  for  a  wide  range  of  scenarios.  The  report 
compared  results  such  as  detection  time,  launch  time,  and  point  of  inter¬ 
cept  rather  than  just  presenting  aggregated  measures  of  aircraft  kills. 
Recommendations  were  made  for  improvements  to  the  models  that 
would  bring  the  results  to  greater  uniformity.  The  strengths  and  limita¬ 
tions  of  each  model  were  discussed,  giving  attention  to  the  structural 
and  logical  differences  in  design  that  often  accounted  for  differences  in 
the  results. 


Summary 


In  examining  evidence  about  support  structures,  documentation,  and  the 
reporting  of  simulation  results,  we  found  that  the  Army  has  established 
functioning  support  structures  for  simulation  activities.  We  believe  that 
although  these  structures  have  limitations,  they  contribute  to  the  credi¬ 
bility  of  the  simulation  results.  The  quality  of  the  documentation  of 
models and results is mixed. The simulations of the ADAGE and COMO
were made at least moderately more credible by detailed documentation.
Inadequate documentation for the Carmonette led to questions about its
credibility. Reporting practices could be improved, but the explicit
treatment of strengths and weaknesses did contribute to the credibility
of all three simulations.


Our third question (What effort has DOD made to foster and reinforce
the credibility of its simulations?) led us to look for formal guidance
applicable to the three simulations we reviewed and to DOD's simulation
activities in general. Formal guidance for controlling the quality of
simulation activities, as for many other activities, might cover (1) initiation,
(2) development, (3) assessment or evaluation, (4) documentation, (5)
use, and (6) maintenance or upkeep. We believe that the guidance would
not only designate the persons who are responsible for simulation
activities and establish management requirements but also describe policies
and procedures for these activities.

We  asked  two  questions  about  formal  guidance  for  establishing  and 
maintaining  credible  simulations: 

•  To  what  extent  has  the  office  of  the  secretary  of  the  Department  of 
Defense  developed  regulations  or  other  general  guidance  that  addresses 
the  development  and  assessment  of  simulations,  even  if  it  is  not  about 
specific  models  or  simulations? 

•  To  what  extent  has  the  Army  or  its  organizations  provided  regulations 
or  guidance  on  development  and  assessment  for  organizations  that  pro¬ 
duce  simulations? 

Although  our  search  led  us  to  look  for  relevant  guidance  throughout 
DOD, we did not comprehensively review all related guidance, such as
guidance  in  information  resources  management,  automated  data 
processing,  studies  and  analysis,  and  testing  and  evaluation.  We  also 
limited  our  focus  to  the  guidance  found  in  our  review  of  the  three  Army 
air  defense  simulations;  Air  Force  and  Navy  guidance,  therefore,  is  not 
included. 


Guidance From the Office of the Secretary


We  found  no  formal  guidance  specifically  for  simulations  from  the  level 
of  the  secretary  of  the  department.  However,  we  did  find  related  regula¬ 
tions  from  the  secretary’s  office  that  could  be  applied  to  computer  simu¬ 
lations.  The  more  important  ones  are  summarized  below. 


The need for information and the use of analysis to support
weapon-system acquisition decisions are stated in DOD directives 5000.1 and
5000.2. These direct that some form of system-effectiveness analysis, in
conjunction with analyses of costs and other factors, be performed to
support milestone decisions. Directive 5000.3, on testing and evaluation,
states that

"The use of properly validated analysis, modeling, and simulation is strongly
encouraged,  especially  during  early  development  phases  to  assess  those  areas 
which,  because  of  safety  or  testing  capability  limitations,  cannot  be  directly 
observed  through  testing." 

While  these  directives  encourage  the  use  of  simulations  and  other  analy¬ 
ses,  they  do  not  give  guidance  on  prerequisites  for  sound  simulations, 
how  to  develop  them,  or  how  to  assure  their  credibility. 

Regulations  on  automated  data  processing  and  the  management  of  infor¬ 
mation  resources  may  be  partly  applicable,  because  simulations  are  run 
on  computers.  However,  directives  on  these  topics  focus  mostly  on 
input-output  processing  and  file  structure.  They  do  not  always  include 
other  topics  important  to  computer  simulations,  such  as  the  construction 
of  models,  the  treatment  of  assumptions  and  limitations,  and  the  verifi¬ 
cation  and  validation  of  models.  Guidance  on  automated  data  processing 
typically  focuses  more  on  the  processing  of  input  data  than  on  creating 
data as part of the process. While DOD's directives and standards in this
area  may  be  useful,  they  are  inadequate  to  guide  the  development  and 
maintenance  of  computer  simulations. 

One example of guidance related to simulations is that dealing with the
quality of computer software. The issue of software quality is not new
to computer programming, and since the 1970's a great many professional
papers have been published on various aspects of software quality
and reliability. The concept of “quality” is somewhat elusive and
includes a number of factors such as reliability, portability, usability,
and maintainability.

One of DOD’s major concerns with software quality began with the software
used in weapon systems or “mission-critical computer systems.”
For example, the 1978 military standard Weapon System Software Development
addressed a number of issues related to quality.1 Directive 5000.3,
issued in 1979 and updated in 1986, also includes guidance for testing
and evaluating the software as well as the hardware components of defense
systems. In 1983, a report to the office of the secretary about software
testing and evaluation recommended modifications that would
strengthen directive 5000.3 with respect to mission-critical
applications.2


1U.S. Department of Defense, Weapon System Software Development, MIL-STD-1679 (Navy)
(Washington, D.C.: 1978).

2R. A. DeMillo and R. J. Martin, OSD/DDT&E Software Test and Evaluation Project, vol. 1, Final
Report and Recommendations (Atlanta: Georgia Institute of Technology, 1983), pp. 1-2.
There are indications that DOD’s interest in the evaluation of software is
being extended to software systems in general. When Weapon System
Software Development was revised in 1982, the draft title was changed
to “Software Development” and its stated purpose was to establish “uniform
requirements for the development of software for the Department
of Defense,” expanding the standard to a much broader class of software.
The 1985 revision, issued as DOD-STD-2167, is entitled “Defense
System Software Development.” Another indication of this broadening
interest is the April 1985 draft entitled “Software Quality Evaluation,”
which 

"establishes  requirements  for  software  quality  evaluation  ...  to  be  performed  dur¬ 
ing  the  development  and  support  of  software  in  Mission-Critical  Computer  Systems 
(MCCS). This standard may also be applied to the evaluation of software in
non-MCCS."3


Although  this  interest  in  the  quality  of  software  began  with  weapon  sys¬ 
tems,  it  may  be  generalized  to  all  computer  systems.  However,  among 
the  military  personnel  involved  with  simulations,  we  did  not  find  sub¬ 
stantial  interest  in  or  recognition  of  the  importance  of  a  systematic 
approach  for  addressing  software  quality.  Arguments  that  can  be  raised 
against  designing,  programming,  and  testing  software  to  satisfy  estab¬ 
lished  engineering  standards  of  quality  include  that  it  will  take  more 
time,  at  least  early  in  the  process;  it  will  be  more  costly;  and  it  is  not 
mandatory  for  applications  not  mission-critical.  These  arguments  may 
be  appropriate  for  some  simulations  that  are  small  and  have  a  short¬ 
term  or  limited  purpose.  But  the  results  of  simulations  that  have  a 
longer  term,  develop  a  community  of  users,  and  are  intensive  consumers 
of  computer  and  personnel  resources  may  influence  major  decisions  in 
acquisition,  allocation  of  forces,  or  operations.  The  cost  of  designing  and 
testing  the  quality  of  software  for  these  simulations  becomes  a  neces¬ 
sary  part  of  their  development. 


Army  Regulations  and 
Practices 


The  Army  has  issued  regulations  that  address  the  management  of  mod¬ 
els  in  the  context  of  its  models  improvement  program  and  in  the  man¬ 
agement  of  studies  and  analyses  that  include  modeling.  The  Army  has 
made  an  effort  to  develop  a  hierarchical  modeling  system  that  reflects 
the  guidance  of  the  Army’s  models  committee;  it  was  spelled  out  on 
August  15,  1983,  in  regulation  5-11,  the  most  detailed  Army  statement 


3U.S. Department of Defense, “Software Quality Evaluation,” draft MIL-STD-2168, Washington, D.C.,
April 1985, p. 1; the emphasis is ours.
regarding  modeling  policy  and  practice  among  the  documents  that  we
reviewed.  Its  guidance  is  specific  to  the  models  in  the  hierarchy  that  the 
Army  will  include  in  the  major  modeling  efforts  it  expects  its  many 
organizations  will  use  over  the  next  several  years. 

The purpose of the models improvement program is to develop, document,
and implement a hierarchical family of combat models that could
be  used  to  evaluate  combat  capabilities  and  determine  resource  require¬ 
ments  through  an  integrated  system  of  models  of  theater,  corps,  divi¬ 
sion,  combined  arms,  and  support  task  force  operations.  The  program's 
management  is  specifically  directed  to  ensure  that  appropriate  technical 
procedures  are  used  in  software  development  and  application,  assign 
responsibility  for  the  control  of  the  model's  configurations,  and  identify 
and  assign  the  data  management  responsibilities. 

TRADOC provides specific guidance on managing models and on using and
reporting on simulations that are part of studies. TRADOC’s August 20,
1982, regulation 5-4, entitled “Management, TRADOC Models,” sets forth
the manner in which its models are managed to ensure that high-quality,
responsive models are available for combat development and training.
TRADOC’s March 29, 1985, regulation 11-8, “Management: Army
Programs — Studies Under AR 5-5” and the accompanying pamphlet, “Army
Programs: Studies and Analyses Handbook,” issued on July 19, 1985,
provide guidance on planning and conducting studies as defined in the
Army’s “Management: Army Studies and Analyses” (AR 5-5).4 The
“handbook” discusses studies from inception to completion in considerable
detail to help officers perform timely and high-quality studies. It
includes a detailed description of the strengths and limitations of models,
analytical tools, and guidance on reporting.

As  we  mentioned  in  chapter  6,  tradoc's  regulation  5-4  assigns  manage¬ 
ment-control  responsibilities  to  various  groups  but  does  not  set  out  pro¬ 
cedures  for  maintaining  models.  That  is,  it  does  not  describe  how  to 
systematically  and  routinely  evaluate,  coordinate,  approve,  or  disap¬ 
prove  models  or  how  to  implement  approved  changes.  Although  it  does 
not  establish  requirements  for  establishing  and  maintaining  the  basic 
configuration,  it  does  include  an  outline  of  key  attributes  to  be  covered 
when  describing  models  that  are  in  tradoc’s  inventory. 


4In  the  October  lb,  1981.  regulation  AR  5-5,  "Management:  Army  Studies  and  Analyses,"  the  Army 
took  a  broader  view,  prescribing  policies,  responsibilities,  and  procedures  for  improving  the  quality  of 
its studies and analyses. In addressing a much broader area, this regulation contains no detailed
guidance on modeling approaches.
The  Army  has  established  various  groups  to  address  the  technical  and
management  aspects  of  the  studies  and  the  modeling  process.  For  exam¬ 
ple,  tradoc  analysts  participating  in  the  1984  workshop  on  consistency 
in  tradoc’s  studies  addressed  process,  modeling,  doctrine,  scenario,  and 
“enemy  and  friendly  data.”  They  noted  problems  that  remained  in  areas 
already  covered  by  their  guidance  and  made  many  recommendations  for 
improving  the  quality  of  tradoc’s  simulations.  One  was  the  recommen¬ 
dation  that  the  configuration  of  models  be  controlled,  because  the  thor¬ 
ough  validation  and  verification  of  a  model  that  are  not  followed  by  a 
“benchmark  run”  and  reasonably  tight  configuration  control  allow  an 
unacceptable  risk  of  inconsistency.  They  noted  further  that  agencies 
studying  models  change  them  without  audit  and  without  documentation. 
They  suggested  that  although  configuration  control  is  expensive,  it 
might  be  placed  in  a  body  meeting  periodically  or  as  needed  or  might 
consist  of  the  requirement  that  a  change  be  provided  to  its  proponents, 
the  Combined  Arms  Center,  tradoc,  and  the  like  for  review  prior  to  its 
implementation. 
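
The workshop's concern about benchmark runs and configuration control can be illustrated with a minimal sketch. It assumes a hypothetical toy model (`run_model`) and hypothetical helper names (`fingerprint`, `matches_benchmark`); none of these come from TRADOC's guidance or from the models discussed in this report. The idea is only that a recorded result from a fixed-seed benchmark run lets a reviewer detect a change that was made without audit or documentation.

```python
import hashlib
import json
import random


def run_model(seed: int, n_engagements: int = 100, p_kill: float = 0.3) -> dict:
    """Hypothetical toy model: count kills over a fixed number of engagements."""
    rng = random.Random(seed)  # fixed seed makes the benchmark run repeatable
    kills = sum(1 for _ in range(n_engagements) if rng.random() < p_kill)
    return {"engagements": n_engagements, "kills": kills}


def fingerprint(output: dict) -> str:
    """Reduce a model output to a short, comparable digest."""
    canonical = json.dumps(output, sort_keys=True)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def matches_benchmark(model, seed: int, recorded: str) -> bool:
    """Rerun the benchmark case and compare it with the recorded digest."""
    return fingerprint(model(seed)) == recorded


# Record the benchmark once, after validation and verification.
baseline = fingerprint(run_model(seed=42))


def altered_model(seed: int) -> dict:
    """Stands in for a copy of the model changed without documentation."""
    output = run_model(seed)
    output["kills"] += 1
    return output


print(matches_benchmark(run_model, 42, baseline))      # unchanged model
print(matches_benchmark(altered_model, 42, baseline))  # altered model
```

A check of this kind does not say whether a change is correct; it only flags that the configuration no longer reproduces the benchmark, so the change can be routed to the model's proponents for review.
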

The  workshop  reported  that  the  effectiveness  of  the  study  advisory 
group  that  is  the  principal  oversight  and  review  body  ensuring  quality 
and  consistency  in  the  models  when  they  are  used  in  tradoc’s  studies  is 
often  hampered,  because  it  is  not  ultimately  responsible  for  the  quality 
of  the  simulations  used  in  a  study.  The  study  advisory  group  is  encum¬ 
bered  by  the  large  number  of  members  and  observers  who  attend  it  and 
the  lack  of  depth  in  its  reviews.  In  addition,  the  logistics  of  setting  up  a 
large  group,  preparing  for  it,  and  attending  it  consume  valuable  time, 
especially  for  the  agency  conducting  the  study.  The  workshop  suggested 
two  options.  First,  active  “working  groups”  of  senior  analysts  should 
meet  periodically  throughout  a  study  at  critical  junctures,  not  merely  at 
convenient  milestones,  and  conduct  critical  reviews  in  depth,  analyze 
problems,  implement  solutions  with  some  autonomy,  and  report  to  the 
study  advisory  groups.  This  would  not  only  ensure  more  thorough 
review  but  would  also  permit  more  timely  corrective  action  and  redirec¬ 
tion.  Second,  smaller  executive  groups  of  senior  officials  who  could 
make  immediate  decisions  would  contribute  to  more  productive  dialogue 
and  save  time,  personnel,  and  resources. 

A further manifestation of the Army’s intention to guide and manage its
modeling activity can be seen in the two groups we mention in chapter 6 and
appendix IV that were constituted at different times to oversee the
development of the COMO modeling system. These groups drew their
members from the many commands and organizations that have an
interest in the development of the COMO models.
Summary


Overall,  the  Army  appears  to  be  concerned  about  the  quality  of  its  mod¬ 
els  and  its  responsibility  to  provide  guidance  for  those  who  manage 
them.  Over  the  years,  various  management  and  procedural  improve¬ 
ments  have  been  discussed  and,  at  times,  initiated  in  the  form  of  both 
regulations  providing  guidance  to  developers  of  models  and  committees 
taking  an  active  interest  in  the  ongoing  development  of  specific  models 
and  modeling  efforts  in  general.  We  note,  however,  that  the  guidance 
generally  concentrates  on  management  aspects  and  does  not  provide 
substantive  technical  detail,  especially  concerning  the  systematic  and 
routine  evaluation  of  models. 

At  the  level  of  the  secretary’s  office,  we  found  little  guidance  with  direct 
relevance to simulations, although some DOD directives and regulations
on  related  topics  include  information  pertinent  to  them. 

In  one  area,  the  interest  in  the  quality  of  computer  software  was  ini¬ 
tially  oriented  to  systems  critical  to  military  missions  but  has  gradually 
broadened  to  encompass  computer  systems  in  general,  reflecting  devel¬ 
opments  taking  place  in  the  computer  software  field.  We  believe  that 
stronger  links  between  software  development  and  computer  modeling 
may  facilitate  more  rapid  integration  of  software  advances  into  the  pro¬ 
gramming  of  computer  models.  The  adoption  of  practices  for  assessing 
and  improving  the  credibility  of  simulations  might  be  encouraged  if 
management  gives  greater  attention  to  such  technical  aspects  of  model¬ 
ing  as  software  quality,  statistical  analysis,  and  validation. 
DOD used the simulations we examined to obtain information about the
effectiveness  of  weapon  systems  for  decisions  about  acquisition.  These 
and  other  simulations  were  also  used  to  evaluate  improvements  or 
changes  in  the  systems,  force  levels,  and  operating  doctrine.  Because  the 
credibility  of  the  results  of  simulations  used  for  major  decisions  is 
important,  we  posed  three  broad  questions  about  credibility. 


The Factors in a Systematic
Assessment


We  identified  14  factors  that  are  useful  in  assessing  the  credibility  of  a 
simulation  as  applied  in  a  particular  study.  The  14  factors  fall  into  three 
broad areas of concern: (1) theory, model design, and input data, (2) the
correspondence  between  simulation  outcomes  and  real-world  outcomes, 
and  (3)  the  institutional  process  of  configuration  management,  over¬ 
sight,  and  review  and  documentation  and  reporting  practices.  Severe 
limitations  in  any  one  of  these  areas  would  lead  to  doubts  about  the 
credibility of a simulation but for different reasons. Problems with theory,
design, or input data would pose questions about the basic integrity
of  the  simulation's  internal  structure.  Little  or  no  evidence  on  the  corre¬ 
spondence  of  outcomes  would  leave  insufficient  proof  of  the  extent  to 
which  the  simulation  represents  reality.  The  absence  of  efforts  with 
respect  to  the  institutional  process  would  cast  doubt  that  appropriate 
practices  had  been  used  to  ensure  quality  in  the  first  two  areas,  the  con¬ 
tinuing  integrity  of  the  model,  and  disclosure  of  its  critical  limitations. 

Our  framework  appears  to  be  appropriate  for  reviewing  the  credibility 
of  simulations  of  operational  effectiveness,  which  usually  involve  many 
weapons  against  many  targets.  We  did  not  attempt  to  apply  it  to  other 
types  of  simulations.  For  engineering  simulations,  which  often  involve 
one  weapon  against  one  target,  and  war-game  simulations,  which  often 
involve  confrontations  between  large  forces,  individual  factors  in  the 
framework  may  have  to  be  modified;  the  three  major  areas  of  concern 
should  apply  as  they  are. 

We  believe  our  framework  provides  a  structured  and  useful  way  to 
review  the  credibility  of  the  results  of  simulations  of  operational  effec¬ 
tiveness.  The  14  factors  can  guide  data  collection  and  analysis  to  help  in 
understanding  both  the  strengths  of  the  simulations  that  would  enhance 
confidence  in  using  the  results  and  limitations  of  them  that  threaten 
confidence  and  point  to  the  need  for  remedial  efforts. 
The Results of Reviewing
Operational-Effectiveness
Simulations


Nonexistent or weak evidence of validation efforts (factor 11) posed a
major threat to credibility in all three case study simulations. Validating
a simulation’s results by comparing them to real-world results is a difficult
problem in weaponry. It cannot be solved easily but would be
helped by more efforts first to identify appropriate data sources and
methods for validation comparisons and then to use them.

According to our review, credibility was consistently supported by only
a few of the factors in our framework for the three simulations. All
three simulations were fairly strong, with some limitations, at including
important measures of effectiveness (factor 2), modeling weapon-to-target
engagement (part of factor 4), and testing the parameters of models
and running the models with alternative scenarios (factor 10). The
reports on all three simulations were relatively complete in discussing
the strengths and weaknesses of the analyses (factor 14).

Despite these strengths, the limitations of other factors reduced credibility
and thereby the usefulness of the simulations. Therefore, we believe
it would be imprudent to use the results directly in major acquisition
decisions without correcting the weaknesses. We believe that even with
these limitations, the results can be used in an exploratory way to identify
possible problems in the weapon systems. With greater caution,
they might also be used for extending evidence on weapon-system performance
to cover many more conditions than would be possible in field
tests. A simulation’s results may be quite valuable for these purposes
within the constraints imposed by its limitations.


DOD’s Efforts

The  office  of  the  secretary  of  the  Department  of  Defense  has  issued  no 
formal  guidance  specifically  for  the  management  of  simulations  or  how 
to  conduct  them  and  assess  their  credibility.  Although  several  directives 
and  at  least  one  military  standard  have  some  bearing  on  simulations,  we 
found  no  documented  evidence  that  the  secretary’s  office  has  sought  to 
develop  and  implement  appropriate  quality  controls  that  could  be 
expected  to  directly  improve  the  credibility  of  simulations. 

The  Army  has  been  more  active  in  fostering  the  development  of  organi¬ 
zations  and  guidance  that  can  directly  influence  the  credibility  of  simu¬ 
lations’  results.  Several  Army  organizations — parts  of  the  command 
structure  as  well  as  less  formal  working  groups — have  roles  in  oversee¬ 
ing  and  upgrading  simulations.  The  Army  has  also  issued  several  regula¬ 
tions  and  a  handbook  that  emphasize  specific  aspects  of  configuration 
management  and  reporting  results. 
We  conclude  that  the  Army’s  efforts  are  noteworthy  in  both  intent  and
performance  but  that  additional  actions,  especially  more  guidance  on 
the  technical  aspects  of  simulations  and  requirements  for  validation, 
would  improve  simulations  and  thereby  enhance  their  credibility. 


Recommendations 


We  support  the  efforts  dod  has  made  to  develop  and  sustain  credible 
simulations.  We  recommend  that  to  reinforce  these  efforts  and  to  ensure 
that  such  practices  are  followed,  the  secretary  of  the  Department  of 
Defense  develop  and  implement  guidance  on  producing,  validating,  doc¬ 
umenting,  managing,  maintaining,  using,  and  reporting  weapon-system 
effectiveness  simulations.  The  guidance  should  include  a  provision  for 
routine  reviews  of  a  simulation’s  credibility  and,  in  this  way,  the  identi¬ 
fication  of  problems  that  should  be  resolved.  The  secretary  should  also 
explore  the  possibility  of  requiring  that  a  statement  regarding  validation 
accompany  the  report  of  a  simulation’s  results. 

We recommend that to make the ADAGE, Carmonette, and COMO III models
more useful in future applications, the agency responsible for managing
each simulation explore the feasibility of remedying the limitations we
identified, especially in the area of validation.


DOD’s Comments and
Our Response

DOD commented on a draft of this report; our response appears in appendix
V. DOD attributed 21 findings to the report, concurring fully with 19
and concurring partially with 2. DOD concurred with the two recommendations
presented in the report.

DOD’s comprehensive and detailed review indicates clearly that simulation
is an area of importance to DOD, one in which it agrees that improvements
can and should be made.

The letter transmitting DOD’s response raises concerns about generalizing
from three case studies and asserts that the report does indeed do this
without, however, citing specific examples to support this assertion.
From our perspective, we made every effort to avoid inappropriate
generalization, and we believe we were successful. A major focus of our
study was to demonstrate that one can systematically collect and analyze
information about a simulation that would permit one to assess the
credibility of that simulation. Using operational-effectiveness simulations,
our three case studies show the feasibility of an approach for
simulations of that kind. We do not infer from these case studies anything
with regard to the credibility of other simulations. Our recommendations
are based on both our review of DOD’s effort to foster and reinforce
the credibility of simulations and our case study analyses.

In  its  letter,  dod  highlighted  one  of  the  two  “findings”  to  which  it  gave 
only  partial  concurrence — namely,  that  applying  our  framework  to 
assess  credibility  gives  only  part  of  the  picture  because  quality  depends 
also  on  the  persons  involved,  the  input  data  choices,  and  the  way  the 
model  is  applied.  We  certainly  agree  that  these  are  important  contribu¬ 
tions  to  an  assessment  of  a  simulation’s  credibility,  but  we  do  not  agree 
that  our  framework  excludes  these  factors.  In  fact,  the  application  of 
models  is  considered  under  factor  1  of  our  framework,  input  data  is  the 
focus  of  factor  7,  and  persons  involved  is  included  under  factor  12.  In 
the  report,  we  have  tried  to  indicate  the  importance  of  these  and  other 
elements. 

The  other  finding  to  which  dod  gave  only  partial  concurrence  was  our 
concern  about  the  use  of  the  expected-value  method  for  representing  the 
mathematical  relationships  in  the  engagement  of  multiple  air  defense 
weapons against multiplane attacks in the ADAGE Campaign submodel.

By  pointing  out  several  limitations,  we  did  not  intend  to  imply  that  the 
expected-value  approach  is  intrinsically  bad.  The  concerns  we  reported 
were  raised  either  by  dod  personnel  themselves  or  by  experienced  mod¬ 
els  practitioners.  Moreover,  we  tempered  our  criticisms  in  this  area  with 
other  statements  in  the  report  pointing  out  that  the  theoretical  approach 
of the ADAGE was appropriate for addressing decisions concerning competing
air defense weapons even though it was an expected-value model.


In this appendix, we define terms commonly associated with simulation
models and explain the simulations used in the weapon-system acquisition
programs for the DIVAD and the Stinger.


Definition  of  Terms 


Simulation  is  the  overall  process  in  which  a  system  is  modeled  and  the 
model  is  experimented  with.  In  this  report,  “model”  refers  to  the  repre¬ 
sentation  of  an  object,  a  system,  an  activity,  or  a  situation  by  something 
other  than  itself.  It  might  be  a  logical,  mathematical,  or  physical  repre¬ 
sentation  or  a  combination  of  these.  A  model  represents  the  system,  its 
elements  (or  variables),  and  the  relationships  between  the  elements  that 
govern  their  interaction. 

The  types  of  simulations  or  models  of  combat  the  military  services  use 
to  support  decisions  are  often  described  or  categorized  in  several  ways: 

in  terms  of  the  numbers  of  friendly  versus  enemy  units  or  systems 
engaged  in  combat  events,  from  one-on-one  to  one-on-few,  many-on- 
many,  or  theater-level  interactions; 

in  terms  of  the  organizational  levels  of  the  units  engaged,  from  battalion 
to  corps  or  division  to  theater; 

in  terms  of  the  degree  of  detail  in  depicting  combat  events,  whether 
high-resolution  simulations  that  depict  smaller  units  in  fine  detail  or 
low-resolution  or  large-scale  simulations  that  depict  larger  units  in 
highly  aggregated  variables. 

Simulations are also categorized by the techniques they employ. A computer
simulation is a model of a weapon’s behavior in combat that is run
entirely on a computer. A hardware-in-the-loop simulation substitutes
one or more actual components of weaponry for a portion of the model,
the remainder of the model being handled by computer. A man-in-the-loop
simulation places a human being (a radar operator or pilot, for
example) into direct interaction with the computer or hardware-in-the-loop
simulation.

Simulation models may be further classified as stochastic or deterministic.
A stochastic simulation model (described by some authors as a
Monte Carlo or probabilistic model) has one or more random variables as
inputs. Since random inputs lead to random outputs, the outputs can be
considered only statistical estimates of the true characteristics of the model.
Simulation models that contain no random variables are deterministic.
For a given set of input data, deterministic simulation models provide a
unique set of outputs.
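
The distinction between the two classes can be sketched in a few lines of code. The example below is a minimal illustration under stated assumptions, not a representation of any model reviewed in this report: the function names, the number of shots, and the kill probability are all hypothetical. The deterministic function returns the same expected value every time, while the stochastic version draws random outcomes and therefore yields only an estimate, reported with a standard error.

```python
import random


def deterministic_kills(shots: int, p_kill: float) -> float:
    """Deterministic (expected-value) model: one unique output per input set."""
    return shots * p_kill


def stochastic_kills(shots: int, p_kill: float, rng: random.Random) -> int:
    """One stochastic replication: each shot is an independent random draw."""
    return sum(1 for _ in range(shots) if rng.random() < p_kill)


def monte_carlo_estimate(shots: int, p_kill: float, replications: int, seed: int):
    """Repeat the stochastic model and summarize the random outputs.

    Returns the sample mean and its standard error, which are statistical
    estimates of the model's true mean rather than exact values.
    """
    rng = random.Random(seed)
    results = [stochastic_kills(shots, p_kill, rng) for _ in range(replications)]
    mean = sum(results) / replications
    variance = sum((r - mean) ** 2 for r in results) / (replications - 1)
    std_error = (variance / replications) ** 0.5
    return mean, std_error


# Hypothetical inputs: 8 shots, each with a 0.25 chance of a kill.
print(deterministic_kills(8, 0.25))
print(monte_carlo_estimate(8, 0.25, replications=5000, seed=1))
```

Running more replications narrows the standard error but never removes it, which is why the outputs of a stochastic model call for statistical analysis before they are compared or reported.
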
In  the  context  of  simulations  and  models,  hierarchy  refers  to  a  vertical
sequencing  relationship  in  which  the  outputs  of  one  model  provide 
inputs  to  a  more  aggregated  model.  However,  a  sequence  of  models  or 
simulations  in  weapons  acquisition  may  refer  to  the  order  in  which  mod¬ 
eling  and  simulation  are  performed.  Generally,  the  order  is  from  com¬ 
puter  simulations  of  subsystems  up  to  the  full  system  in  its  operational 
environment  to  hardware-in-the-loop  simulations  to  man-in-the-loop 
simulations. 
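
The hierarchical relationship described above, in which the output of one model becomes an input to a more aggregated model, can be sketched as follows. This is a minimal illustration under stated assumptions: the function names, the three probabilities, and the rule that engagements are spread evenly and independently across targets are all hypothetical and are not drawn from the Army's hierarchy of models.

```python
def engagement_model(p_detect: float, p_hit: float, p_kill_given_hit: float) -> float:
    """Higher-resolution (one-on-one) model.

    Output: the probability that a single engagement kills its target,
    assuming detection, hit, and kill are independent steps.
    """
    return p_detect * p_hit * p_kill_given_hit


def force_model(n_weapons: int, engagements_per_weapon: int,
                n_targets: int, p_kill: float) -> float:
    """More aggregated (many-on-many) model.

    Input p_kill is taken from the lower-level model. Engagements are assumed
    to be spread evenly across targets and to be independent. Output: the
    expected number of surviving targets.
    """
    if n_targets <= 0:
        raise ValueError("n_targets must be positive")
    engagements_per_target = n_weapons * engagements_per_weapon / n_targets
    p_survive = (1.0 - p_kill) ** engagements_per_target
    return n_targets * p_survive


# The lower-level output feeds the more aggregated model.
p_kill = engagement_model(p_detect=0.5, p_hit=0.5, p_kill_given_hit=1.0)
print(force_model(n_weapons=4, engagements_per_weapon=1, n_targets=4, p_kill=p_kill))
```

Because the aggregated model inherits its inputs from the level below, any error or unvalidated assumption in the lower-level model is carried upward into the aggregated results.
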


The  Use  of 
Simulations  for  Two 
Weapon  Systems 


Simulations were used extensively in the development of the DIVAD and
the Stinger weapon systems. The program offices for both noted that as
their budgets became tighter and the systems more costly, they made
greater use of simulations to augment data from physical tests. The
one-on-one, item-engineering models, with or without hardware-in-the-loop,
were used to assess technical performance. Force-on-force simulations
were used to assess operational effectiveness.


The DIVAD

Prior to the Army’s 1976 decision to develop a new air defense gun to
replace the VULCAN air defense gun, the Army Materiel Systems Analysis
Activity had constructed and validated antiaircraft gun models. In
1971, during the gun air defense effectiveness study, a simulation model
for the VULCAN was built and validated with field-test data. Later,
other air defense gun simulation models were built and validated, using
data from the gun low-altitude air defense test. These models (the Fire
Unit Effectiveness model for the VULCAN and the Modern Gun Effectiveness
Model for the DIVAD) were the basis for all the Army Materiel
Systems Analysis Activity one-on-one air defense gun studies during the
mid-1970’s. The models were modified to simulate other air defense gun
systems and validated with field data.

The two contractors that were selected to build the prototype DIVAD gun
systems (Ford Aerospace and Communication Corporation and General
Dynamics Corporation) were asked to develop computer simulations
concurrently and to base them on the Modern Gun Effectiveness Model
to represent their respective systems. In 1980, the Army validated these
models with data from the field tests of the prototypes.

Since 1977, several studies and analyses have used force-on-force simulations
to investigate the need for and contributions of the DIVAD gun. In
1977, the Army reported on the cost and operational-effectiveness analysis
of the division air defense gun. The report examined whether the
procurement of a DIVAD gun, as one component of future air defense
weaponry, was the most cost-effective solution for air defense missions.
The ADAGE simulation was created to perform this analysis, and a
generic 35-mm gun was modeled. The recommendation was to proceed
with the development of the DIVAD gun and to place 36 DIVAD guns per
division in the field.

The division air defense gun cost and operational-effectiveness analysis
update, completed in June 1984, addressed concerns regarding operational
and developmental test results and new threat projections. The
Army Air Defense Artillery School was instructed to use the Carmonette
in this analysis, which was specifically designed to address the effectiveness
of the performance of the gun as indicated both by test data (an
“as tested” version) and by expected production characteristics (a
“mature” version). The study, conducted by the Army TRADOC Systems
Analysis Activity, concluded that force effectiveness increased when
the DIVAD was added to the forces, even with performance shortfalls
shown by testing and significant increases in the projected threat. The
Army Air Defense Artillery School also conducted additional analyses
using the ADAGE.

The DIVAD force structure analysis, an offshoot of the update, supported
the recommendation of 36 DIVAD guns in the 1977 analysis. Decisions
supported by these analyses led to the exercising of options I and II of
the contract with Ford Aerospace and Communication. A decision on
option III was deferred until the fall of 1985 to allow testing for operational
effectiveness, suitability, and limited production. To support the
review process for option III and assist the secretary of Defense in
deciding whether to continue with the production of the DIVAD gun, a
comparative analysis was directed by the Department of the Army. The
analysis, which used the ADAGE and Carmonette models, examined the
ability of the DIVAD to perform its designated mission within its postulated
initial operational capability on the battlefield. It also examined
the ability of alternative weapon systems to perform the same mission.

The  adage  helped  determine  the  effectiveness  of  air  defense  systems  in 
terms  of  resources  saved  in  a  division.  In  a  parallel  effort,  the  model  was 
also  list'd  to  determine  the  operational  effectiveness  of  the  divad  for  dif¬ 
ferent  levels  of  its  performance  parameters,  and  the  results  determined 
the  levels  of  degradation  at  which  the  divad  would  become  less  effective 
than  the  alternative  systems  under  consideration.  The  results  of  the 
effectiveness  analysis  were  used  to  compare  the  operational  effectiveness
of  the  divad's  alternatives.  The  Carmonette  model  examined  the
alternatives  in  the  context  of  an  intense  battle  with  a  battalion  task
force. 


The  Stinger

At  the  engineering  level,  digital,  analog-digital,  and  hardware-in-the-loop
simulations  have  played  a  major  role  in  the  Stinger's  development
and  product  improvement.  At  least  three  such  simulation  capabilities 
have  been  developed.  General  Dynamics,  the  contractor,  verified  a  simu¬ 
lation  with  various  types  of  flight  and  nonflight  tests.  When  the  output 
of  the  simulations  was  confirmed,  the  results  could  be  used  as  a  design 
tool.  The  Army  used  a  similar  simulation  at  U.S.  Army  Missile  Command
to  validate  the  contractor’s  performance  data  and  to  investigate 
improvement  alternatives.  A  third  simulation  was  developed  at  the 
Office  of  Missile  Electronic  Warfare  to  evaluate  electronic  counter¬ 
measure  and  counter-countermeasure  performance  and  to  assess 
vulnerability. 

The  Stinger’s  operational  combat  effectiveness  was  assessed  with  the 
Tactical  Air  Defense  Computer  Operational  Simulation  in  a  cost  and 
operational-effectiveness  analysis  reported  in  1977.  Several  alternative, 
portable  air  defense  systems  were  evaluated  under  identical  situations, 
including  a  comparison  of  the  relative  effectiveness  of  the  Stinger  and 
the  Redeye  in  various  environments. 

Another  study  focusing  on  operational  employment  issues  used  the  COMO 
III  to  investigate  the  Stinger’s  battery-coolant-unit  use  rates  in  a  wartime
environment.  This  was  the  study  we  reviewed,  because  it  was  reasonably
well  documented,  the  model  on  which  it  was  based  was  well
documented,  and  the  programmers,  analysts,  and  managers  were  still 
available  for  interviews  and  questions. 
The  Theoretical
Approach 


A  model’s  basic  theoretical  approach  for  evaluating  the  effectiveness  of 
a  weapon  system  may  be  engineering,  to  determine  the  optimal  design  of 
the  weapon  systems;  functional,  to  aid  in  selecting  the  most  effective 
weapon  system  from  alternative  systems  performing  the  same  function 
(for  example,  air  defense);  or  combined  arms,  to  compare  alternative
uses  of  competing  weapon  systems  (for  example,  air  defense  weapons 
versus  helicopters  versus  tanks). 

The  Carmonette  is  a  combined-arms  model  designed  to  answer  broad 
trade-off  questions  about  armor,  infantry,  artillery,  and  the  like.  It 
focuses  on  the  total  ground  battle,  not  individual  weapon  systems;  air 
defense  considerations  have  only  recently  been  added  to  the  model.  The 
adage,  in  contrast,  is  basically  an  expected-value  model,  designed  to
study  the  effectiveness  of  combinations  of  ground-based  weapons  in
providing  air  defense  to  a  division.  The  como  III  was  likewise  designed  to
study  the  various  factors  involved  in  providing  air  defense  but,  like  the 
Carmonette,  it  is  a  Monte  Carlo  model  and  it  operates  at  high  resolution. 
(We  have  summarized  the  three  models’  theoretical  approaches  in  table 
4.2.) 
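
The distinction drawn above between an expected-value model and a Monte Carlo model can be illustrated with a toy engagement. The sketch below is not drawn from the adage, Carmonette, or COMO III code; the function names, the kill probability, and the number of aircraft are all hypothetical. It only shows that an expected-value calculation returns a single mean outcome, while a Monte Carlo calculation draws random outcomes over many replications and so also shows their spread.

```python
import random


def expected_value_kills(num_aircraft, p_kill):
    """Expected-value approach: one deterministic answer, the mean outcome."""
    return num_aircraft * p_kill


def monte_carlo_kills(num_aircraft, p_kill, replications, seed=0):
    """Monte Carlo approach: draw each engagement at random, once per replication."""
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    outcomes = []
    for _ in range(replications):
        kills = sum(1 for _ in range(num_aircraft) if rng.random() < p_kill)
        outcomes.append(kills)
    return outcomes


# Hypothetical inputs: 10 aircraft, each killed with probability 0.3.
expected = expected_value_kills(10, 0.3)
outcomes = monte_carlo_kills(10, 0.3, replications=2000)
mc_mean = sum(outcomes) / len(outcomes)

print(f"expected-value result: {expected:.2f}")
print(f"Monte Carlo mean:      {mc_mean:.2f}")
print(f"Monte Carlo range:     {min(outcomes)} to {max(outcomes)}")
```

The Monte Carlo mean should settle near the expected-value result as replications increase, but only the Monte Carlo runs reveal how far a single battle can depart from that mean.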

As  functional  models,  the  adage  and  como  III  emphasize  the  adequacy  of
air  defense;  the  other  aspects  of  war,  where  they  are  included,  focus
on  how  changes  in  air  defense  capability  can  change  battle
outcomes.  In  both  models,  however,  the  emphasis  is  air  defense,  not  the
total  battle.  Even  critics  of  the  adage  agree  on  its  usefulness  in  choosing
between  air  defense  systems.  The  adage  and  como  III  are  also
systems-analysis  models  in  that  they  are  designed  to  provide  informa¬ 
tion  to  decisionmakers  concerning  various  alternatives  for  providing  air 
defense  and  are  not  useful  for  considering  trade-offs  between  air 
defense  and  other  wartime  functions. 

Why  should  the  differing  theoretical  approaches  of  the  Carmonette,
adage,  and  como  make  any  difference?  With  the  emphasis  of  the  adage 
and  como  on  air  defense,  only  a  less-detailed  portrayal  of  the  remainder 
of  the  war  may  be  sufficient  to  judge  the  trade-off  between  competing 
air  defense  systems.  The  Carmonette’s  emphasis  on  combined  arms  in 
the  total  battle  means  that  some  elements  are  often  omitted  or  aggregated
in  simulations  of  air  defense  in  a  manner  such  that  important
information  can  sometimes  be  lost. 

In  our  opinion,  the  basic  approaches  of  the  adage  and  como  are  more
appropriate  for  studying  air  defense  trade-offs  than  a  combined-arms
model  like  the  Carmonette,  which  has  to  be  modified  to  accommodate  air
defense. 


The  protection  of  operational  and  strategic  assets  from  enemy  aircraft  is 
the  primary  mission  of  U.S.  air  defense  forces;  the  attrition  of  enemy 
aircraft  is  secondary.  Although  the  Carmonette  may  be  able  to  produce 
information  on  protection,  its  emphasis  in  the  divad  analyses  was  on
attrition.  It  stressed  the  comparison  of  the  loss  of  enemy  forces  to  the 
loss  of  friendly  forces  in  the  form  of  various  exchange  ratios.  (We  have 
summarized  the  Carmonette  and  the  adage  and  como  III  in  table  4.3.)

In  its  analyses,  the  Carmonette  produced  “killer-victim  scoreboards,”  or
matrixes  comparing  kills  of  all  types  of  enemy  aircraft  by  all  types  of 
friendly  air  defense  weapons  and  kills  of  all  types  of  ground  targets  by 
enemy  aircraft.  The  figures  from  the  matrixes  were  used  for  compari¬ 
sons  of  the  effectiveness  of  weapons.  The  principal  force-effectiveness 
measures  reported  in  the  Carmonette  were  the  loss-exchange  ratio  (or 
the  total  enemy  losses  divided  by  total  friendly  losses)  and  the  frac¬ 
tional  exchange  ratio  (the  percentage  of  enemy  losses  divided  by  the 
percentage  of  friendly  losses).  Systems  ratios  permitted  the  comparison 
of  losses  of  friendly  weapons  to  losses  of  one  target  or  all  targets  against 
which  the  weapon  was  used  (for  example,  the  divad  against  the  HIND
helicopter  or  the  divad  against  all  target  aircraft). 
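
The two force-effectiveness ratios defined above can be written out as a short calculation. This is a minimal sketch that follows only the definitions given in the text; the function names and the loss figures are hypothetical and are not taken from any Carmonette analysis.

```python
def loss_exchange_ratio(enemy_losses, friendly_losses):
    """Total enemy losses divided by total friendly losses."""
    return enemy_losses / friendly_losses


def fractional_exchange_ratio(enemy_losses, enemy_initial,
                              friendly_losses, friendly_initial):
    """Share of the enemy force lost divided by the share of the friendly force lost."""
    enemy_share = enemy_losses / enemy_initial
    friendly_share = friendly_losses / friendly_initial
    return enemy_share / friendly_share


# Hypothetical battle: the enemy loses 30 of 60 systems,
# friendly forces lose 10 of 40.
ler = loss_exchange_ratio(30, 10)
fer = fractional_exchange_ratio(30, 60, 10, 40)

print(f"loss-exchange ratio:       {ler:.2f}")
print(f"fractional exchange ratio: {fer:.2f}")
```

The two ratios differ whenever the opposing forces start at different strengths, which is why both were reported. Neither ratio weights losses by the relative worth of the assets lost.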

The  emphasis  in  all  these  comparisons  was  attrition.  No  differentiation 
was  made  between  the  relative  worth  of  assets  lost.  Other  measures  of 
effectiveness  reported  in  the  Carmonette  analyses  were  the  number  of 
helicopter  remaskings  caused  by  radar  warning  and  the  number  of  mis¬ 
sion  aborts  caused  by  damage  to  enemy  aircraft  from  ground  fire.  These 
measures  were  not  covered  in  the  adage.

Although  the  adage  can  produce  statistics  that  can  be  converted  into  the 
same  type  of  attrition  statistics  that  the  Carmonette  does,  the  effective¬ 
ness  measure  emphasized  in  the  adage  cost  and  operational-effective¬ 
ness  analysis  was  the  protection  of  assets.  Friendly  assets  were  assigned 
a  value  called  “military  worth,”  the  assets  having  a  military  value  to  the 
enemy  as  well  as  to  friendly  forces.  Military  worth  to  the  enemy  was 
used  in  enemy  air-raid  allocations;  the  principal  measure  of  effective¬ 
ness  was  the  military  worth  of  friendly  forces  remaining  after  enemy 
raids.  The  analysis  also  reported  the  worth  of  individual  classes  of 
targets  remaining  and  showed  how  military  worth  declined  over  several 
days  of  fighting  and  how  much  of  the  loss  of  friendly  military  worth 
was  attributable  to  the  ground  war  only  and  to  ground  and  air  wars
combined. 
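
A worth-weighted protection measure of the kind described above might be sketched as follows. The report does not give the adage formula, so the calculation, the function name, and every worth value and survival fraction below are assumptions made only to show how such a measure differs from a simple count of losses.

```python
def military_worth_remaining(assets):
    """Return (worth remaining, fraction of initial worth remaining).

    assets: list of (military_worth, fraction_surviving) pairs, one per
    class of friendly target. Both values are hypothetical inputs.
    """
    initial = sum(worth for worth, _ in assets)
    remaining = sum(worth * surviving for worth, surviving in assets)
    return remaining, remaining / initial


# Hypothetical target classes after a series of enemy raids.
assets = [
    (100.0, 0.5),  # high-worth class, half surviving
    (50.0, 1.0),   # lower-worth class, untouched
    (50.0, 0.0),   # lower-worth class, destroyed
]
remaining, fraction = military_worth_remaining(assets)

print(f"worth remaining: {remaining:.1f} ({fraction:.0%} of initial)")
```

Under a measure of this form, losing a high-worth asset lowers the score more than losing several low-worth assets, a distinction that attrition counts alone do not make.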

The  proportion  of  loss  attributable  to  enemy  fixed-wing  aircraft  was  a 
major  source  of  concern  to  the  study  advisory  group  and  the  critics  of 
the  adage.  Ground  damage  attributable  to  enemy  aircraft  was  so  great 
that  credibility  was  questioned  in  comparison  to  the  Carmonette  and 
other  models.  These  concerns  and  the  possibility  that  the  adage  may 
overstate  damage  by  fixed-wing  aircraft  mean  that  this  aspect  of  the
adage  modeling  may  need  refining.  Nevertheless,  from  the  theoretical 
perspective,  it  seems  able  to  report  measures  of  effectiveness  that  are 
appropriate  to  air  defense. 

Since  the  COMO  III  does  not  model  interactions  between  ground  forces,  it 
is  limited  in  its  ability  to  use  preservation  as  a  principal  measure  of 
effectiveness.  While  analysis  in  the  como  III  may  concentrate  on  mea¬ 
sures  of  attrition,  its  flexibility  allows  a  wide  range  of  measures  of 
effectiveness.  One  example  is  its  use  in  the  analysis  of  Stinger  battery- 
coolant-unit  usage,  where  the  output  measure  was  the  number  of  units 
needed  to  fire  each  missile. 

A  chronological  description  of  the  critical  events  of  a  como  III  simulation 
is  available  in  summary  form.  The  measures  of  effectiveness  are  the 
analyst’s  choice.  They  are  based  on  the  raw  material  of  the  simulation 
history,  which  includes  detection  attempts,  detected  targets,  completed 
reloads,  the  availability  of  a  system,  missile  intercepts,  threat  attrition, 
the  amount  of  munitions  used,  and  kill  ranges,  among  other  things.  This 
information  is  available  by  fire  unit,  platoon,  battery,  battalion,  or  sce¬ 
nario,  and  it  is  further  processed  into  report  outputs  summarizing  the 
activity  at  a  site  and  the  effectiveness  of  threats  and  air  defense. 

Each  simulation  addressed  measures  of  effectiveness  in  operational 
terms,  the  adage  better  than  the  Carmonette  or  como  III,  since  it  pro¬ 
duced  measures  related  to  protection  in  addition  to  the  attrition  of 
enemy  aircraft  and  war-exchange  ratios.  The  Carmonette  might  have 
produced  this  type  of  information,  but  it  did  not  address  this  facet  of 
the  air  defense  mission.  The  como  did  not  address  this  measure  since  it 
did  not  cover  the  ground  war  at  all.  However,  the  como  III  was  able  to 
produce  a  measure  of  effectiveness  especially  designed  for  the  study  of 
the  battery  coolant  unit. 
The  Portrayal  of  the
Weapon  System’s 
Immediate 
Environment 


Once  a  model’s  theoretical  approach  is  understood,  one  can  assess  how 
well  it  treats  the  critical  aspects  of  a  weapon  system’s  behavior  in  tacti¬ 
cal  combat.  How  the  model  formulates  them  determines  the  critical  vari¬ 
ables  to  be  considered  and  how  the  variables  relate  to  one  another  in 
describing  not  only  the  behavior  of  the  weapon  system  but  also  the 
overall  war  environment  in  which  the  weapon  system  is  to  be  used.  We 
believe  it  is  important  in  the  evaluation  of  the  model’s  portrayal  of  the 
various  characteristics  of  a  weapon  system  to  consider  both  the 
weapon’s  tactical  environment  and  how  it  operates  in  combat.  The  tacti¬ 
cal  environment  involves  such  features  as  the  size  and  duration  of  bat¬ 
tle,  the  potential  target  set  of  a  weapon  system,  the  deployment  and 
movement  of  the  system,  and  the  terrain  in  which  it  is  to  operate.  (We 
have  summarized  the  issues  of  environment  in  table  4.4.) 


Level  of  Battle

Since  the  divad  was  to  be  a  divisional  rather  than  a  battalion  or  some
other  air  defense  weapon,  the  adage  model,  developed  specifically  to
address  the  divad  gun,  treats  the  weapon  as  a  division  weapon.  The 
Carmonette,  however,  addresses  sections  of  the  battlefield  only  up  to 
the  battalion  level  and,  thus,  could  preclude  the  weapon  from  engaging 
some  targets  it  was  designed  to  kill  or  suppress.  Moreover,  not  all  the 
Carmonette  analyses  included  the  effects  of  all  battalion  divad  guns 
because  of  the  small  block  of  terrain  being  modeled.  Critics  of  the 
Carmonette  as  a  tool  for  analyzing  the  divad  assert  that  air  defense  is  a 
division  responsibility  and  that  some  aspects  of  the  surface-to-air  battle 
are  overlooked,  because  the  focus  is  limited  to  a  battalion  battle. 

Unlike  either  the  adage  or  Carmonette,  the  como  III  can  be  played  at  any 
level,  one-on-one,  battalion,  division,  or  even  theater  conflicts.  For  the 
analysis  of  the  Stinger’s  battery  coolant  unit,  the  analysts  selected  a 
front-to-rear  brigade  slice,  a  representation  of  an  area  they  believed 
encompassed  a  sufficiently  large  number  of  air  defense  units  and  threat 
aircraft  and  helicopters  to  provide  a  realistic  exercise.  The  activities  of 
99  Stinger  units  and  more  than  300  threat  aircraft  were  represented  in 
the  analysis. 

The  fact  that  the  Carmonette  focuses  on  an  intense,  25-minute  battalion 
battle,  as  opposed  to  the  adage’s  small  raids  by  enemy  aircraft  against 
targets  in  the  division  over  several  days,  is  also  of  some  concern.  A  conflict
simulated  with  the  adage  can  last  up  to  30  days,  and  logistics  are
included.  The  Carmonette  battle  covers  less  than  10  percent  of  the  terri¬ 
tory  of  an  adage  battle  and  includes  4  divad  guns,  while  the  adage  uses 
36.  The  Carmonette  emphasizes  the  effects  of  aircraft  only  in  the  main 
battle  area,  whereas  the  adage  also  portrays  the  effects  of  aircraft
against  combat  support  units  to  the  rear  of  the  division.  Moreover,  when 
it  comes  to  measuring  the  potential  damage  attributable  to  enemy  air¬ 
craft,  a  25-minute  firefight  cannot  be  directly  compared  to  a  battle  of 
several  days.  In  effect,  the  adage  purports  to  model  the  results  of  sev¬ 
eral  Carmonette  battles  and  measures  the  cumulative  effect  of  enemy 
air  attacks  on  the  ability  of  friendly  forces  to  wage  war. 

The  analyst  chooses  the  level  of  play — battalion,  brigade,  division,  or 
higher — for  the  como  III  but  the  model  is  limited  in  its  ability  to  play 
battles  of  extended  length,  since  it  does  not  model  logistics.  The  study  of 
the  battery  coolant  unit,  whose  purpose  was  to  determine  the  number  of 
units  each  Stinger  required  in  wartime,  worked  with  the  initial  supply 
position  and  did  not  address  resupply.  COMO  documents  indicate  that  a
typical  simulation  represents  about  2  hours  of  real  time.  The  complexity 
of  the  Stinger  scenario  and  environment  was  limited  in  order  to  reduce 
the  resources  required  for  computer  runs. 


Targets

Another  significant  difference  between  the  adage  and  Carmonette  in  the
treatment  of  the  divad  was  the  weapon’s  potential  set  of  targets.  The
adage  modeled  nonjinking  helicopters  and  fixed-wing  aircraft  with  fixed 
flight  paths  as  potential  threats  and  dealt  with  the  damage  from  fixed- 
wing  attacks  in  the  rear  as  well  as  forward  areas  of  the  division.  Fixed- 
wing  aircraft  were  not  included  in  most  of  the  analyses  using  the 
Carmonette.  The  tradoc  study  advisory  group  recognized  the  omission
as  a  serious  deficiency  but  did  not  demand  changes  to  the  Carmonette 
model.1 

Even  when  the  Carmonette  finally  addressed  fixed-wing  aircraft,  it  did 
so  by  using  information  produced  by  another  model  that  addressed  surface-to-air
gun  attacks  in  essentially  the  same  manner  as  the  adage.  The
Carmonette  was  modified  after  the  last  divad  study  to  include  a  fixed- 
wing  component,  but  no  analyses  of  the  divad  were  made  with  it  because 
the  divad  program  was  cancelled. 


1TRADOC’s  study  advisory  groups  monitor  the  progress  of  its  studies  and  review  and  provide  advice
on  the  planning,  performance,  and  reporting  of  specific  studies  to  both  the  agencies  conducting  them
and  the  agencies  directing  that  they  be  done.  Group  members  represent  interested  organizations  that
know  aspects  of  a  particular  study  but  are  not  directly  involved  in  it.  They  meet  three  or  more  times
at  critical  points  during  a  study,  and  subgroups  review  the  more  technical  matters,  such  as  analyses,
costs,  scenarios,  doctrine,  and  threats.  The  minutes  of  a  study  advisory  group  meeting  can  become 
directives. 

The  results  of  using  an  ADAGE-type  modeling  approach  in  conjunction
with  the  Carmonette  led  to  the  conclusion  that  fixed-wing  aircraft  were
not  a  significant  threat  to  assets  of  combat  ground  units  in  the  forward
part  of  the  main  battle  area,  a  conclusion  that  contradicted  a  conclusion
from  the  adage  model  alone.  The  difference  came,  to  a  large  degree,
from  the  Carmonette’s  focus  at  the  battalion  level,  where  fixed-wing  aircraft
may  not  be  significant,  compared  to  the  adage’s  focus  at  the  division
level,  where  the  damage  from  fixed-wing  aircraft  is  a  more
important  consideration.  It  is  not  clear  that  including  a  fixed-wing  component
would  overcome  the  difficulties  resulting  from  the  Carmonette's
more  limited  concentration.

In  the  COMO  III,  the  Stinger  could  attack  helicopters  and  fixed-wing  aircraft.
The  study  of  the  battery  coolant  unit  included  both  air  threats.
Most  COMO  III  modeling,  however,  has  concentrated  on  the  threat  from
fixed-wing  aircraft.


Weapon  Deployment  and
Movement


Another  important  aspect  of  modeling  the  use  of  a  weapon  is  how  a
model  portrays  the  weapon’s  deployment  and  movement  on  the  battle¬ 
field.  The  analyst  determines  the  tactics,  deployment,  and  decision  rules 
that  are  to  become  input  for  the  Carmonette.  The  reports  on  the 
Carmonette’s  simulation  of  the  divad  indicate  that  the  analysts  studied 
the  effectiveness  of  the  alternative  deployment  of  weapons.  While  there 
was  some  concern  about  the  appropriate  portrayal  of  the  divad's  deploy¬ 
ment  in  the  Carmonette  analyses,  the  concerns  were  about  the  analysts' 
input  rather  than  the  fundamental  theory  of  weapons  deployment. 

The  Carmonette  has  a  submodel  that  uses  mobility  factors  as  inputs  to 
treat  movement  on  the  battlefield.  The  Carmonette  allows  weapons  to 
move  in  response  to  firing,  permits  well-defined  movement  patterns,  and 
allows  intermediate  stops  in  them.  At  one  time,  the  Carmonette  would 
not  allow  the  divad  to  fire  on  the  move,  but  this  problem  was  corrected 
in  the  analyses.  Movement  rates  in  the  Carmonette  were  affected  by  the 
environment:  the  mode  of  movement,  terrain  slopes,  and  ground  condi¬ 
tions  such  as  the  presence  of  paved  roads,  dirt  roads,  no  roads,  and  so 
on.  On  the  whole,  the  Carmonette’s  treatment  of  weapon  deployment 
and  movement  was  suitable  for  the  divad. 

In  contrast,  the  adage  assumes  a  static  deployment.  It  deploys  weapons 
in  rectangles  or  zones  of  terrain.  A  division's  dimensions  are  input  for 
the  adage  model,  and  for  purposes  of  computing  aircraft  attrition,  it 
partitions  a  division  into  zones  parallel  to  the  forward  edge  of  the  battle 
area.  Air  defense  weapons  within  one  zone  are  assumed  to  be  uniformly
distributed.  The  adage  gives  some  indirect  recognition  to  deployment,
since  the  one-on-one  Incursion  places  air  defense  weapons  randomly  rel¬ 
ative  to  aircraft  flight  paths  in  several  replications  that  determine  one- 
on-one  attrition  factors.  These  factors  are  used  in  the  many-on-many 
simulation  in  the  Campaign  submodel,  in  which  air  defense  weapons  are 
assumed  to  be  uniformly  distributed  within  each  zone  of  the  battlefield 
modeled.  Thus,  the  adage  results  are,  in  effect,  the  average  of  several
randomly  generated  weapon  deployments. 

Not  only  does  the  adage  not  directly  portray  how  weapons  are  deployed; 
it  also  does  not  portray  the  movement  of  the  divad.  The  Incursion  does 
not  portray  the  movement  of  air  defense  units.  It  is  possible  that  move¬ 
ment  is  portrayed  indirectly  in  the  Campaign,  since  it  applies  a 
probability-of-participation  factor  to  ground-to-air  attrition  rates  in 
determining  final  attrition  rates.  The  movement  of  air  defense  units  may 
be  partially  portrayed  by  adjusting  these  factors  to  represent  the
“nonavailable”  time  caused  by  the  movement  of  the  weapon.  On  the
whole,  however,  the  adage’s  treatment  of  weapon  deployment  and 
movement  has  to  be  considered  less  adequate  than  the  Carmonette's. 
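
The indirect treatment of movement described above, in which a probability-of-participation factor scales a ground-to-air attrition rate, can be sketched as a simple product. The report does not state how the Campaign submodel combines these terms, so the multiplication, the function names, and the numbers below are assumptions for illustration only.

```python
def participation_probability(minutes_available, minutes_total):
    """Share of time the weapon is in position and able to engage.

    Time spent moving counts as nonavailable time (hypothetical inputs).
    """
    return minutes_available / minutes_total


def final_attrition_rate(one_on_one_rate, p_participation):
    """Scale a one-on-one attrition factor by the chance the weapon takes part."""
    return one_on_one_rate * p_participation


# Hypothetical case: a weapon spends 15 of every 60 minutes moving.
p_part = participation_probability(minutes_available=45, minutes_total=60)
rate = final_attrition_rate(one_on_one_rate=0.5, p_participation=p_part)

print(f"probability of participation: {p_part:.2f}")
print(f"final attrition rate:         {rate:.3f}")
```

A single scaling factor of this kind lowers the attrition rate evenly across the battle. It cannot show where a weapon was when it moved or whether moving improved its survival, which is the limitation the text identifies.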

Like  the  Carmonette,  the  como  III  deploys  the  Stinger  according  to  the 
analyst’s  specifications,  but  like  the  adage,  it  does  not  specifically  model 
the  movement  of  defensive  weapons,  except  aircraft.  Rather,  it 
addresses  movement  through  the  lessening  of  the  probability  of  partici¬ 
pation.  The  como  III  allows  individual  Stinger  units  to  become  opera¬ 
tional  or  nonoperational  at  specific  times,  a  capability  that  may  be  used 
to  roughly  simulate  movement.  The  individual  Stinger  teams,  however, 
are  given  specific  locations  by  the  analyst.  The  Army’s  field  manual  on 
the  Stinger’s  team  operations  emphasizes  that  frequent  movement  as  far 
as  several  hundred  meters  contributes  to  survival.  Moving  after  each 
firing,  unless  there  is  another  aircraft  to  be  engaged,  could  affect  the 
time  that  the  team  is  actually  in  operation.  Neither  the  greater  likeli¬ 
hood  of  survival  nor  a  decrease  in  operations  because  of  the  team’s 
movement  appears  to  be  directly  included  in  the  como  III  model. 


Terrain

How  a  simulation  models  terrain  is  important  for  air  defense  weapons
like  the  divad  and  Stinger,  because  helicopters,  one  of  their  primary
targets,  can  use  terrain  to  mask  their  intentions  until  moments  before 
they  fire.  The  adage  uses  a  statistical  terrain;  the  Carmonette  and  como 
III  use  a  digitized  map  of  a  geographic  area.  Problems  associated  with 
these  approaches  are  worth  commenting  on.  For  example,  the  adage's 
statistical  terrain  was  based  on  empirical  data  from  an  extensive  study  of
World  War  II  tank  battles  that  may  not  represent  the  line-of-sight  considerations
appropriate  for  air  defense  in  the  1980’s.  The  model’s  terrain
does  not  depend  on  the  scenario,  which  can  be  viewed  either  as  a 
strength,  because  the  results  can  be  generalized,  or  as  a  weakness, 
because  the  results  do  not  seem  real. 

Terrain  in  the  adage  was  specified  by  a  distribution  of  unmask-remask 
ranges  that  depended  on  aircraft  altitude,  type  of  terrain,  and  the  height 
of  the  weapon  site  relative  to  the  mean  terrain.  The  terrain  parameter 
specified  only  whether  terrain  was  rough,  rolling,  or  open,  and 
intervisibility  (the  ability  to  see  between  two  points)  is  calculated  with  a 
statistical  model,  given  that  parameter.  For  specific  aircraft  altitudes, 
weapon  heights,  and  flight  paths,  the  mean  unmask  range  was  deter¬ 
mined,  and  random  draws  determined  the  probability  of  unmask  and 
remask  for  each  replication  of  the  Incursion.  Interruptions  in 
intervisibility  were  not  considered,  and  the  aircraft  was  detectable  from 
the  first  unmask  until  remask.  It  should  be  noted,  however,  that  the 
adage  plays  terrain  only  in  the  Incursion  model,  where  it  is  used  in 
developing  the  probability  of  kill;  it  is  not  explicitly  incorporated  in  the 
Campaign,  and  it  is  not  considered  in  the  ground  war. 
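
A statistical terrain of the kind described above can be sketched as random draws of unmask range around a mean that depends on the terrain category. The report does not give the form of the distribution, so the exponential distribution, the mean ranges, the weapon range, and the function names below are all assumptions. The sketch also ignores aircraft altitude and weapon-site height, which the adage took into account.

```python
import random

# Hypothetical mean unmask ranges (km) for the three terrain categories
# named in the text; the values are illustrative only.
MEAN_UNMASK_KM = {"rough": 1.5, "rolling": 3.0, "open": 6.0}


def draw_unmask_ranges(terrain, replications, seed=0):
    """Draw one unmask range per replication around the terrain's mean.

    An exponential distribution is an assumption made for this sketch.
    """
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    mean_km = MEAN_UNMASK_KM[terrain]
    return [rng.expovariate(1.0 / mean_km) for _ in range(replications)]


def share_unmasked_within(ranges_km, weapon_range_km):
    """Fraction of replications in which the aircraft unmasks inside weapon range."""
    inside = sum(1 for r in ranges_km if r <= weapon_range_km)
    return inside / len(ranges_km)


ranges = draw_unmask_ranges("rolling", replications=5000)
share = share_unmasked_within(ranges, weapon_range_km=4.0)

print(f"mean unmask range: {sum(ranges) / len(ranges):.2f} km")
print(f"share of replications unmasking within 4 km: {share:.2f}")
```

Because the draws depend only on a terrain category and not on a map, the results are not tied to any one location. That is the trade-off the text describes between generality and realism.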

In  contrast,  for  the  divad  study,  the  Carmonette  modeled  a  specific  area 
near  Hunfeld,  Germany,  with  terrain  data  from  the  Defense  Mapping 
Agency  and  additional  data  on  vegetation  and  traffic  from  a  waterways 
experiment  station.  Although  this  provided  a  more  realistic  portrayal  of 
terrain,  the  limitation  to  a  single  area  was  viewed  as  a  deficiency,  but  no 
requirement  for  any  other  terrain  was  imposed.  Whether  other  terrain 
would  have  changed  the  conclusions  about  the  divad  is  unknown. 

The  como  III,  like  the  Carmonette,  uses  digitized  data  that  describe  par¬ 
ticular  terrain  areas  in  West  Germany.  Lines  of  visibility  are  determined 
for  each  Stinger  unit  and  the  aircraft  that  may  become  targets.  That  the 
como  III  appropriately  considers  visual  masking  is  important,  because 
many  of  the  Stinger’s  targets  are  aircraft  of  relatively  low  altitude. 

The  adage  and  como  address  the  tactical  environment  reasonably  well, 
whereas  the  Carmonette  is  weak  in  this  area.  Both  the  adage  and  como 
simulate  a  battlefield  of  the  size  appropriate  for  air  defense,  and  both 
simulate  all  the  targets  likely  to  be  encountered  in  air  defense.  The 
adage’s  coverage  of  the  length  of  battle  is  the  more  appropriate  for  air 
defense,  since  its  battle  of  many  days  best  addresses  the  cumulative 
damage  attributable  to  air  attack.  The  adage’s  portrayal  of  terrain 
allows  generalizations  more  easily  than  that  of  the  Carmonette  or  como
III,  but  it  is  less  realistic.  The  Carmonette’s  strength  regarding  the  environment
is  its  ability  to  portray  the  movement  of  ground  weapons,  while
the  limited  portrayals  in  the  adage  and  como  are  definitely  weaknesses. 


The  Portrayal  of  the 
Weapon  System’s 
Operational 
Performance 


A  complete  model  of  air  defense  weapons  not  only  focuses  on  how  a 
weapon  engages  and  fires  on  enemy  aircraft  but  also  considers  how  that 
weapon  works  with  other  air  defense  weapons  to  maintain  the  ability  of 
ground  forces  to  resist  an  enemy  invasion  on  land.  A  consideration  of 
how  a  weapon  system  operates  in  combat  involves  such  features  as  the 
detection  of  and  engagement  with  its  assigned  targets.  In  air  defense, 
detection  can  be  either  visual  or  by  radar,  either  of  which  can  be 
affected  by  battlefield  obscurants  or  problems  with  command,  control, 
and  communications  as  they  relate  to  identifying  whether  a  potential 
aircraft  target  is  a  friend  or  foe.  A  consideration  of  engagement  involves 
the  physical  characteristics  of  the  air  defense  weapon  system,  the  procedures
of  its  engagement  of  attacking  aircraft,  and  the  application  of  those
procedures  when  more  than  one  aircraft  is  attacking. 

For  air  defense  weapons,  an  important  aspect  of  modeling  is  how  well 
computer  models  portray  the  way  weapons  detect  and  engage  enemy 
aircraft.  The  important  aspects  of  air  defense  include  radar  and  visual 
detection,  battlefield  obscurants,  battle  management,  iff,  and  command, 
control,  and  communications  as  they  relate  to  iff.  The  important  aspects 
of  engagement  include  the  characteristics  of  a  weapon  affected  by  the 
engagement  procedure  and  the  application  of  those  procedures  to  multi¬ 
ple  aircraft  raids. 


Detection  of  Enemy
Aircraft

In  table  4.5,  we  have  summarized  how  each  of  the  three  simulations  represented
the  critical  aspects  of  the  air  defense  mission  related  to  the
detection  of  enemy  aircraft.


Visual  Detection

Both  the  adage  and  Carmonette  modeled  how  the  divad  gun  detected
enemy  aircraft  and  included  provisions  for  visual  detection.  Originally,
the  adage  used  a  separate  visual  detection  model  called  VISPOE,  developed
by  the  U.S.  Army  Missile  Command,  and  results  from  this  model
were  used  as  input  for  the  Incursion  submodel  of  the  adage.  The 
Carmonette  used  the  visual  detection  model  developed  by  the  night 
vision  and  electro-optical  laboratory.  However,  because  differences 
between  VISPOE  and  the  laboratory’s  model  could  not  be  resolved  for 
the  cost  and  operational-effectiveness  update  and  the  comparative  analysis,
the  adage  was  modified  to  use  data  from  the  latter  model  for  pop-up
helicopters,  whereas  the  Carmonette  used  data  from  the  former  for
the  detection  of  fixed-wing  aircraft  in  the  comparative  analysis. 

In  the  original  adage  analyses,  the  VISPOE  model  incorporated  gradual 
lessenings  of  expected  visibility  to  the  full  range  of  the  divad  gun  by 
extrapolating  limited  Fort  Knox  field  test  data  on  helicopter  detection 
ranges.  In  contrast,  in  the  Carmonette  analyses,  a  ground-to-ground 
detection  model  was  modified  to  include  helicopters;  incorporated  visual 
detection  distances  up  to  only  3  kilometers,  considerably  short  of  the 
divad  gun  range;  and  treated  this  detection  range  as  a  “brick  wall”
beyond  which  no  visual  detection  could  occur.  Because  of  this  range 
shortfall  and  because  the  Carmonette  analysts  disagreed  with  the  proce¬ 
dure  of  extrapolating  VISPOE  data,  the  Carmonette  analyses  of  the 
divad  used  the  forward-looking  infrared  detection  routine  as  a  proxy  for 
the  visual  detection  of  helicopters  to  the  full  range  of  the  divad  gun.  In 
addition,  the  basic  probabilities  of  detection  assumed  that  the  ground 
observers  in  the  night  vision  and  electro-optical  laboratory  model  had 
infinite  time  in  which  to  detect  targets,  so  the  Carmonette  modelers  had 
to  insert  search-time  limits  in  order  to  keep  the  model  from  accepting 
unrealistically  long  search  times. 
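
The contrast between a gradual decline in visual detection and a "brick wall" cutoff can be sketched with two toy detection functions. Only the 3-kilometer cutoff comes from the text. The linear shape of the gradual decline, the close-range probability, and the 4-kilometer maximum range are hypothetical values chosen for illustration and do not describe VISPOE or the laboratory model.

```python
def p_detect_brick_wall(range_km, p_inside=0.8, cutoff_km=3.0):
    """Constant detection probability inside the cutoff, none beyond it."""
    return p_inside if range_km <= cutoff_km else 0.0


def p_detect_gradual(range_km, p_close=0.8, max_range_km=4.0):
    """Detection probability that declines steadily out to a maximum range.

    A straight-line decline is an assumption made for this sketch.
    """
    if range_km >= max_range_km:
        return 0.0
    return p_close * (1.0 - range_km / max_range_km)


# Compare the two treatments at a few hypothetical target ranges.
for range_km in (1.0, 2.9, 3.5, 4.0):
    wall = p_detect_brick_wall(range_km)
    gradual = p_detect_gradual(range_km)
    print(f"{range_km:.1f} km  brick wall: {wall:.2f}  gradual: {gradual:.2f}")
```

Between the cutoff and the weapon's full range, the brick-wall treatment allows no visual detection at all, while the gradual treatment still allows some. That gap is what led the Carmonette analysts to substitute another detection routine.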

These  two  characteristics — the  divad's  forward-looking  infrared  and 
search-time  limits — were  also  incorporated  into  the  adage  for  the  visual 
detection  of  helicopters.  Since  the  divad  was  not  equipped  with  forward- 
looking  infrared  detection  capability,  its  use  as  a  primary  visual  detec¬ 
tion  model  for  helicopters  resulted  in  a  model  that  did  not  properly  rep¬ 
resent  the  operating  characteristics  of  the  gun.  The  Stinger  model  in  the 
como  allowed  either  the  use  of  a  simple  probability  of  detection  that 
would  be  the  same  for  fixed-wing  aircraft  and  helicopters  or,  alterna¬ 
tively,  the  use  of  tables  showing  the  probability  of  detection  as  a  func¬ 
tion  of  the  type  of  aircraft.  Like  the  Carmonette,  the  Stinger  model  in 
the  como  appears  to  limit  the  visual  detection  search  range  and  impose  a 
“brick  wall.”


Other  aspects  of  visual  detection  important  in  the  tactical  environment 
include  nighttime  vision  and  smoke,  dust,  and  glare.  The  adage  does  not 
model  night  conditions  while  the  Carmonette  does.  The  developers  of  the 
adage  sought  to  include  the  direct  effects  of  smoke,  dust,  and  glare  in 
their  model  but  did  not  do  so,  apparently  because  of  a  lack  of  empirical 
data.  For  the  cost  and  operational-effectiveness  analysis  update,  the 
adage  was  given  a  provision  to  handle  smoke  the  same  way  it  handles
bad  weather — indirectly,  by  adjusting  the  input  values  of  the 
probability  of  participation  in  the  ground-to-air  war. 

Using  a  fully  dynamic  detection  model,  the  Carmonette  played  the 
effects  of  smoke,  dust,  fog,  rain,  snow,  and  aerosols.  The  adage  permits 
the  selection  of  weather  conditions  that  determine  the  Incursion  outputs 
that  are  used  as  Campaign  inputs,  but  only  visual-detection  parameters 
are  directly  modeled  in  the  Incursion.  The  como  did  not  directly  play  the 
effects  of  smoke,  dust,  weather,  or  the  time  of  day  or  night.  These  are 
included  indirectly  by  allowing  the  analyst  to  input  degraded  probabili¬ 
ties  and  search  ranges  of  visibility. 


Command,  Communications,  and  Control  and  IFF

Neither  the  adage  nor  the  Carmonette  addresses  command,  control,  and
communications  and  iff  directly.  While  documents  concerning
Carmonette  indicate  some  ability  to  play  command,  control,  and  commu¬ 
nications,  the  Carmonette  studies  of  the  divad  specifically  excluded  their 
effects.  The  adage  gives  some  indirect  consideration  to  command  and 
control  in  its  Incursion  submodel,  because  these  are  considered  in  the 
visual  detection  model  used  in  the  adage.  Any  command  and  control 
effects  on  the  total  battle  are  difficult  to  determine,  however,  since  the 
Incursion  produces  only  one-on-one  attrition  results,  which  are  used  as 
inputs  to  the  Campaign  battle  model. 

Command  and  control  were  not  explicitly  played  in  the  Campaign.  The 
adage  gave  only  indirect  consideration  to  iff  in  the  Incursion  by  includ¬ 
ing  it  as  one  of  several  factors  in  establishing  a  divad  crew  reaction  time 
in  engaging  detected  aircraft.  Whether  this  provision  for  iff  in  the  Incur¬ 
sion  has  any  effect  on  the  battle  in  the  Campaign  is  difficult  to  deter¬ 
mine  since  the  iff  effects  on  reaction  time  are  not  differentiated  from 
any  of  the  other  effects.  Moreover,  the  adage  plays  friendly  air  in  the 
Campaign,  but  the  model  structure  does  not  permit  the  engagement  of 
friendly  air  by  friendly  air  defense  forces,  thus  omitting  a  consideration 
of  the  potential  failure  to  properly  identify  friendly  aircraft.  The  Stinger 
weapon  does  not  require  a  modeling  of  command,  control,  and  communi¬ 
cations,  since  Stinger  teams  are  free  to  engage  other  targets  or  move. 

The  Stinger  model  in  the  como  does  not  model  iff  since  it  does  not  allow 
friendly  aircraft  to  become  potential  targets. 

Radar  Detection

Both  the  adage  and  Carmonette  provide  for  the  detection  of  enemy  aircraft
by  radar.  Early  versions  of  the  Carmonette  did  not  correctly  portray
radar-detection  capabilities,  but  changes  produced  a  model  that  is
probably  superior  to  the  adage  in  this  regard.  In  the  early  stages  of  con¬ 
sidering  the  use  of  the  Carmonette  to  model  the  divad,  objections  were 
raised  because  the  Carmonette  did  not  correctly  play  the  primary  mode 
of  the  divad’s  operation — a  combination  of  radar  and  optics — nor  did  it 
include  the  effect  of  electronic  countermeasures  in  counteracting  the 
divad  radar.  In  addition,  the  Carmonette  originally  was  not  able  to  model 
the  full  detection  capabilities  of  the  divad  radar.  The  Carmonette  was 
modified  to  handle  all  these  problems  for  divad  analyses. 

The  adage  does  not  model  radar  detection  directly;  instead,  it  includes 
radar  effects,  covering  the  gun's  full  range,  in  the  input  data.  The  adage 
matches  the  flight  path  of  approaching  aircraft  against  radar  boundary 
“footprints” — input  data — to  determine  whether  an  aircraft  can  be 
detected  and,  if  so,  when.  The  effects  of  electronic  countermeasures  are 
included  in  determining  the  “footprints.”  This  approach  to  modeling 
radar  was  used  to  produce  a  quick-running  model.  Not  only  does  the 
adage  not  play  radar  detection  directly;  it  also  does  not  portray  how 
aircraft  respond  to  radar  warning.  It  assumes  that  a  flight  path  does  not
change  even  when  an  aircraft  would  be  likely  to  maneuver.  Overall,  therefore,  the
Carmonette  appears  to  model  radar  detection  better  than  the  adage 
does.  Radar  detection  is  not  applicable  to  the  Stinger. 
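The footprint-matching idea described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the adage's routine: the footprint is treated as a circle, the flight path is a preset list of (time, x, y) points, and the function name is hypothetical. Electronic countermeasures are represented only by supplying a smaller radius as input, which mirrors the statement that such effects are folded into the footprint data.

```python
import math

def first_detection_time(flight_path, footprint_center, footprint_radius_km):
    """Return the first time a preset flight path lies inside a circular
    footprint, or None if it never does. The path never reacts to detection."""
    cx, cy = footprint_center
    for t, x, y in flight_path:
        if math.hypot(x - cx, y - cy) <= footprint_radius_km:
            return t
    return None

# Hypothetical straight-in approach sampled every 10 seconds: (time_s, x_km, y_km).
path = [(0, 10.0, 0.0), (10, 8.0, 0.0), (20, 6.0, 0.0), (30, 4.0, 0.0), (40, 2.0, 0.0)]

clear_air = first_detection_time(path, (0.0, 0.0), 5.0)
with_ecm = first_detection_time(path, (0.0, 0.0), 3.0)  # smaller input footprint
```

Because the path is fixed input, the sketch also shows the limitation the text notes: nothing in the loop lets the aircraft maneuver once it has been detected or warned.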

In  summary,  none  of  the  models  provides  complete  coverage  of  the 
detection  aspects  of  air  defense.  Visual  detection  is  generally  limited  in 
range.  Command  and  control  and  iff  are  either  not  covered  at  all  or  covered
only  indirectly.  The  Carmonette  covers  radar  detection  and  battlefield
obscurants  reasonably  well,  but  the  adage  and  como  address  them
only  indirectly,  if  at  all. 


Engagement  of  Enemy  Aircraft

Once  computer  models  indicate  that  air  defense  weapons  have  detected
enemy  aircraft,  they  must  then  model  how  those  weapons  proceed  to
engage  and  destroy  enemy  aircraft.  All  the  models  encompass  this
engagement-and-firing  process,  each  having  strengths  and  weaknesses 
in  its  approach.  In  table  4.6,  we  have  summarized  these  strengths  and 
weaknesses. 


Weapon  Characteristics

The  adage  was  developed  specifically  to  study  the  proposed  divad  gun,
but  the  Carmonette  originally  based  its  modeling  of  the  divad  on  the
capabilities  of  the  Soviet  ZSU-23-4,  an  antiaircraft  gun.  This  version  of
the  Carmonette  was  used  in  the  antihelicopter  study  that  first  raised 
serious  questions  about  the  effectiveness  of  the  divad.  Disclaimers  in  this 
study’s  report  stated  that  no  conclusions  regarding  the  divad  should  be 
made  because  of  inappropriate  modeling  of  aspects  of  the  divad.  Conse¬ 
quently,  corrections  to  the  Carmonette  were  necessary  and  a  revised 
model  was  used  for  the  1984  cost  and  operational-effectiveness  update. 
Since  the  characteristics  of  the  divad  gun  have  been  similarly  modeled  in 
both  the  adage  and  Carmonette,  we  believe  that  any  further  differences 
probably  result  from  how  the  gun  was  modeled  for  use  in  combat. 

The  COMO  Stinger  model  was  based  on  the  physical  characteristics  of  the 
Stinger  weapon  system  and  its  operational  procedures.  The  physical 
characteristics  can  be  altered  if  the  intention  is  to  evaluate  prospective 
enhancements.  Programming  changes  would  generally  be  required  to 
make  changes  in  operation;  however,  one  feature  is  that  firing  doctrine, 
which  must  be  responsive  to  existing  conditions,  is  selected  in  the  data 
input  phase. 


Both  the  adage  and  the  Carmonette  model  engagement  procedures.  In
the  adage,  all  short-range  air  defense  weapons  could  engage  aircraft  in 
the  "fly-by"  mode — that  is,  aircraft  fly  past  the  air  defense  weapon 
enroute  to  another  target — or  the  vicinity-of-target  mode — that  is,  air¬ 
craft  maneuver  during  ordnance  delivery  on  a  target  defended  by  the 
weapon.  However,  the  adage  directly  models  one-on-one  engagements 
only  in  its  Incursion  component,  the  results  of  which  are  used  as  input 
data  in  the  Campaign  many-on-many  expected-value  model. 

The  adage  many-on-many  approach  does  not  properly  account  for  the 
spatial  or  temporal  saturation  of  many  enemy  aircraft  attacking  at  the 
same  time.  Other  aspects  of  the  adage’s  failure  to  model  many-on-many 
engagements  directly  are  (1)  the  adage  does  not  permit  guns  to  switch 
targets;  (2)  the  adage  does  not  allow  the  number  of  aircraft  to  change 
during  segments  of  a  raid;  (3)  the  adage  does  not  handle  the  effect  of 
mission  aborts  properly;  and  (4)  the  adage  assumes  perfect  coordination 
between  air  defense  units  in  seeking  and  engaging  the  same  target.

Even  in  one-on-one  modeling,  there  are  problems  with  the  adage's  portrayal
of  weapon-aircraft  engagement.  Since  the  Incursion  did  not  model
duels,  the  divad  could  engage  and  kill  threat  aircraft  but  the  threat  air¬ 
craft  could  not  directly  engage  the  divad.  The  divad’s  attrition  as  a  target 
class  was  played  in  the  Campaign  submodel  and  the  destroyed  guns 
were  removed  only  at  the  end  of  a  raid.  Thus,  the  divad  could  remain
operational  to  inflict  damage  when  it  might  otherwise  have  been 
destroyed.  This  approach  is  similar  to  the  attrition  of  enemy  aircraft 
and  is  a  problem  inherent  in  the  expected-value  approach.  In  addition, 
the  adage  definition  of  the  divad  target  class  permitted  target  overkill,
which  resulted  in  the  destruction  of  fewer  numbers  of  the  divad  than  in 
the  Carmonette  for  the  same  number  of  threat  missiles  fired  at  it.  Furthermore,
aircraft  in  the  adage  fly  a  constant  heading  and  altitude  and
do  not  react  to  ground  fire  or  radar  warning. 
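The effect of removing destroyed guns only at the end of a raid can be illustrated with a small expected-value calculation. This is a sketch under simplifying assumptions rather than a reconstruction of the Campaign submodel: a raid is treated as a sequence of identical passes, and the function names and per-pass probabilities are hypothetical.

```python
def expected_kills_end_of_raid_removal(passes, p_gun_kills, p_gun_killed):
    """Gun losses are booked only after the raid, so the gun fires on every pass.

    p_gun_killed is accepted for a symmetric signature but has no effect here.
    """
    return passes * p_gun_kills

def expected_kills_immediate_removal(passes, p_gun_kills, p_gun_killed):
    """The gun must have survived all earlier passes in order to fire on this one."""
    total = 0.0
    p_alive = 1.0
    for _ in range(passes):
        total += p_alive * p_gun_kills
        p_alive *= 1.0 - p_gun_killed
    return total

late = expected_kills_end_of_raid_removal(4, 0.3, 0.2)
prompt = expected_kills_immediate_removal(4, 0.3, 0.2)
```

With these illustrative inputs, the end-of-raid bookkeeping credits the gun with more expected kills than the version that removes it as soon as it is destroyed, which is the direction of bias the text describes.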

The  Carmonette  plays  rules  of  engagement  but,  unlike  the  adage,  concentrates
on  vicinity-of-target  engagements.  In  the  tradoc  study  advisory
group  discussions  about  adding  fixed-wing  aircraft  to  the
Carmonette,  reviewers  justified  the  exclusion  of  these  aircraft  by  asserting
that  including  a  fly-by  mode  serves  no  useful  purpose,  since  all  it
does  is  give  the  divad  more  targets  to  shoot  at  without  any  effect  on  the 
ground  battle  at  the  battalion  level.  Omitting  the  fly-by  mode  appears  to 
ignore  the  divad's  division-level  responsibilities.  While  the  Carmonette 
allows  different  engagement  doctrines,  air  defense  weapons  generally 
commit  to  engage  only  after  their  particular  targets  have  been 
recognized. 

The  Carmonette  provides  for  selection  from  among  several  targets.  It 
gives  priority  to  the  nearest  target  and  then  prioritizes  targets  according 
to  type  and  speed,  starting  with  hovering  helicopters  and  going  on  to 
moving  helicopters  and  fixed-wing  aircraft.  Some  concern  was 
expressed  about  this  order.  The  Carmonette  simulates  the  divad's  ability 
to  continuously  track  multiple  targets,  retaining  a  track  file  for  future 
engagements  and  continuously  updating  it  with  prioritized  targets.  The 
model  did  not  play  fire  distribution  command  and  control,  so  the  divad, 
which  moved  in  pairs,  could  fire  from  the  two  guns  on  the  same  target. 
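One possible reading of the target-selection rule above is a two-level sort: nearest target first, with target type breaking ties in the order hovering helicopter, moving helicopter, fixed-wing aircraft. The sketch below encodes that reading only. The class, field names, and ranking table are hypothetical, and the Carmonette's actual ordering logic may differ.

```python
from dataclasses import dataclass

# Assumed type ordering, taken from the sequence given in the text.
TYPE_RANK = {"hovering_helicopter": 0, "moving_helicopter": 1, "fixed_wing": 2}

@dataclass
class Track:
    name: str
    kind: str
    range_km: float

def prioritize(tracks):
    """Order a track file by range, then by target type."""
    return sorted(tracks, key=lambda t: (t.range_km, TYPE_RANK[t.kind]))

tracks = [
    Track("A", "fixed_wing", 2.0),
    Track("B", "hovering_helicopter", 2.0),
    Track("C", "moving_helicopter", 1.5),
]
ordered_names = [t.name for t in prioritize(tracks)]
```

Re-running `prioritize` whenever a track is added or updated would correspond to the continuously updated track file mentioned above. Nothing in the sketch prevents two guns from choosing the same first target, which matches the absence of fire distribution control noted in the text.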

The  Carmonette  models  helicopters,  including  their  reaction  to  radar 
warning  and  gunfire,  but  in  1984,  it  did  not  model  fixed-wing  threats. 
For  the  1985  comparative  analysis,  the  U.S.  Army  Materiel  Systems
Analysis  Activity  provided  fixed-wing  aircraft  data  relevant  for  the 
divad  in  the  form  of  tables  generated  by  a  gun-effectiveness  model  simi¬ 
lar  to  the  adage.  More  recently,  a  fixed-wing  aircraft  submodel  has  been 
added  that  allows  preset  flight  paths  with  varying  heading  and  altitude 
but  does  not  alter  the  flight  path  in  response  to  radar  warning  and 
gunfire. 
Since  the  Carmonette  is  an  event-sequenced  Monte  Carlo  model,  it  models
each  engagement  between  an  air  defense  weapon  and  an  aircraft  as  it
occurs.  Attrition  occurs  after  an  engagement  between  an  aircraft  and  an 
air  defense  weapon  rather  than  at  fixed  points  in  time,  as  in  deterministic
models  like  the  adage.  It  should  be  remembered,  however,  that  the
Carmonette  models  only  a  battalion-level  battle  rather  than  a  division-level
battle,  as  the  adage  does,  and  it  models  only  4  divad  guns,  compared  to  the
adage’s  36.

The  como  III  provides  extensive  detail  of  how  weapon  systems  engage 
their  targets.  Like  the  Carmonette,  it  includes  the  coverage  of  multiair¬ 
craft  raids.  Like  the  adage,  the  como  permits  the  engagement  of  all 
targets,  in  contrast  to  the  Carmonette,  which  ignores  aircraft  flying 
through  the  battle  area  to  and  from  deeper  battle  zones.  The  effect  of  a 
saturation  level  of  aircraft  attacking  an  area  defended  by  Stinger  teams 
can  be  demonstrated.  The  separate  and  overall  effects  of  Stinger  and  the 
air  defense  weapon  types  can  also  be  shown. 

To  what  extent,  then,  did  the  three  simulations  we  reviewed  appropri¬ 
ately  characterize  the  critical  aspects  of  air  defense  weapons?  We  looked 
at  specific  aspects  of  the  modeling  of  air  defense  under  three  broad 
areas  of  coverage — weapons  system  environment,  detection  of  enemy 
aircraft,  and  engagement  with  enemy  aircraft.  We  found  that  all  the 
models  had  significant  weaknesses  in  at  least  one  of  these  general  areas. 
Only  one — the  como — completely  modeled  even  one  of  the  general  areas 
of  interest.  The  adage  was  generally  weak  in  its  portrayal  of  the  detec¬ 
tion  of  enemy  aircraft;  the  Carmonette  was  weak  in  its  portrayal  of  the 
weapon-system  environment.  The  como  provided  reasonably  complete 
modeling  of  the  engagement  of  air  defense  weapons  with  attacking 
aircraft. 


The  description  of  a  weapon’s  tactical  environment  should  be  complete 
enough  to  cover  all  the  critical  variables  in  the  total  war  that  might
affect  the  behavior  of  the  weapon.  In  the  three  models,  we  found  differing
approaches  to  various  aspects  of  modern  warfare  and  their  interaction.
In  table  4.7,  we  have  summarized  the  coverage  in  the  three
simulations. 

The  air  defense  tactical  arena  includes  air  war,  ground  war,  and  the 
interaction  of  the  two.  Air  defense  artillery  provides  support  for  tacti¬ 
cal,  operational,  and  strategic  warfare.  Its  mission  is  to  nullify  or  reduce 
the  effectiveness  of  attack  or  surveillance  by  hostile  aircraft  or  missiles 
after  they  are  airborne,  thereby  supporting  the  Army’s  primary  function
of  conducting  prompt  and  sustained  land  warfare  operations.  Short-range
air  defense  and  artillery  units  engage  enemy  close-air-support
helicopters  and  fixed-wing  aircraft  and  engage  ground  targets  in  self- 
defense  when  conflict  with  enemy  ground  forces  is  intense.  Therefore, 
simulations  appropriate  for  studying  the  effects  of  air  defense  weapons 
should  cover  fixed-wing  and  helicopter  targets  as  well  as  the  general 
effects  of  the  ground  war. 


Air  War

Throughout  much  of  the  Carmonette’s  modeling  effort  with  the  divad,  it
failed  to  model  one  of  the  gun’s  primary  targets — fixed-wing  aircraft.  A
1983  Carmonette  study  that  originally  raised  questions  about  the  effec¬ 
tiveness  of  the  divad,  the  antihelicopter  study,  did  not  include  fixed- 
wing  aircraft  as  an  attacker  and  a  potential  target.  Despite  this  concern, 
the  study  advisory  group  did  not  require  the  Carmonette  modelers  to 
develop  fixed-wing  model  coverage,  acknowledging  that  they  did  not 
have  sufficient  time  to  meet  deadlines.  By  the  time  of  the  1985  compara¬ 
tive  analysis,  Carmonette  analyses  did  cover  enemy  fixed-wing  aircraft, 
but  friendly  fixed-wing  counterair  and  iff  were  not  included.  Previous 
concerns  about  the  failure  to  address  friendly  close  air  support  do  not 
appear  to  have  been  addressed. 

From  the  beginning,  the  adage  noted  the  importance  of  fixed-wing  air¬ 
craft  to  the  battle  and  included  almost  all  aspects  of  fixed-wing  air  play, 
omitting  only  the  effects  of  friendly  close  air  support.  Although  the 
adage  recognized  the  need  for  iff  in  determining  gun  reaction  time,  it  did 
not  play  iff  directly  in  its  portrayal  of  the  air  defense  war.  Rather,  the 
air-to-air  war  was  a  separate  component  of  the  model  and  was  played 
only  for  egressing  enemy  aircraft — that  is,  friendly  aircraft  could  be 
killed  by  enemy  aircraft  only  after  the  air-to-ground,  ground-to-air,  and 
ground-to-ground  battles  had  been  played.  This  procedure  did  not  per¬ 
mit  friendly  aircraft  to  become  a  target  for  friendly  air  defense.  Other 
aspects  of  the  adage  that  limited  its  portrayal  of  the  air  war  included  (1)
sequential  rather  than  simultaneous  multiple  enemy  air  raids,  (2)  inap¬ 
propriate  treatment  of  saturation  attacks,  (3)  perfect  intelligence  in 
enemy  air-raid  planning,  and  (4)  the  uniform  distribution  of  air  defense 
weapons  in  a  division  defense  zone. 

Because  the  como  was  developed  primarily  for  tactical  air  defense  sys¬ 
tems,  it  has  always  given  particular  attention  to  modeling  ground-based 
air  defense  weapons  versus  aircraft,  but  it  includes  a  detailed  model  of 
the  air  war.  A  simulation  can  be  as  simple  as  playing  the  Stinger  weapon 
system  against  a  single  type  of  aircraft  or  as  complicated  as  playing  a
fully  formed  defense  at  the  divisional  or  theater  level  against  diverse  air 
attack  scenarios  that  may  include  helicopters,  various  fixed-wing  attack 
aircraft,  and  other  supporting  aircraft.  All  these  abilities  are  external  to 
the  Stinger  submodel.  Although  air  defense  appears  to  be  modeled,  the 
fratricide  of  friendly  aircraft  by  ground-based  air  defense  is  not 
included. 


Ground  War

Since  air  defense  weapons  interact  with  the  ground  war,  the  complete
modeling  of  air  defense  weapons  should  include  coverage  of  the  ground
war  to  determine  both  the  effects  on  the  primary  mission  of  air  defense 
weapons  and  the  survivability  of  the  air  defense  weapons  themselves. 
The  Carmonette  is  an  event-sequenced,  fully  computerized  simulation  of 
ground  combat.  All  combined  arms  are  included:  infantry  (mounted  or 
dismounted),  artillery  and  mortars,  armored  vehicles,  and  helicopters. 
The  Carmonette  can  model  movement,  target  acquisition,  firing,  damage 
assessment,  and  communications.  Resupply  and  evacuation,  however, 
are  not  covered. 

The  adage  does  not  model  the  ground  war  dynamically  but,  rather,  plays 
ground  battle  attrition  external  to  its  Campaign  submodel.  Ground  battle 
damage  to  ground  targets,  including  air  defense  weapons,  is  input  in  the 
form  of  externally  generated  attrition  rates.  Ground-target  attrition 
rates  vary  by  target  class  and  day  of  the  war,  while  air  defense  weapon 
attrition  rates  vary  by  type  of  weapon,  air  defense  zone,  and  day  of  the 
war.  Loss  of  ground  targets  is  determined  by  applying  ground  battle 
attrition  rates,  and  ground  losses  are  assessed  prior  to  each  enemy  air 
raid  each  day.  Ground  war  damage  to  air  defense  weapons  is  distributed 
equally  among  all  weapons  of  the  same  type  to  maintain  uniform  den¬ 
sity  of  air  defense  coverage. 
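The bookkeeping described above, in which externally generated rates are applied before each raid, can be sketched as a table lookup. This is a hedged illustration, not the adage's code: the dictionary layout, the key order (weapon type, zone, day), and the sample numbers are assumptions. Fractional weapon counts are kept because an expected-value model works with averages rather than individual weapons.

```python
def apply_ground_attrition(inventory, rates, day):
    """Return the inventory after one application of externally supplied rates.

    inventory maps (weapon_type, zone) to an expected count.
    rates maps (weapon_type, zone, day) to a fractional loss. A missing entry
    means no loss.
    """
    return {
        (weapon, zone): count * (1.0 - rates.get((weapon, zone, day), 0.0))
        for (weapon, zone), count in inventory.items()
    }

inventory = {("divad", 1): 12.0, ("divad", 2): 12.0}
rates = {("divad", 1, 1): 0.25}  # hypothetical: 25 percent loss in zone 1 on day 1
after_day_1 = apply_ground_attrition(inventory, rates, day=1)
```

Applying one rate to the whole count for a weapon type and zone is what spreads losses evenly and preserves uniform density. It also makes the result depend entirely on the quality of the input rates, which was the study advisory group's concern.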

Moreover,  attrition  rates  are  independent  of  air-to-ground  damage.  The 
study  advisory  group  was  concerned  about  the  ground  war  attrition 
input  data  but  could  not  decide  upon  the  most  appropriate  scenario  for 
generating  input  data.  Compounding  this  problem  was  the  group’s  deter¬ 
mination  that  there  was  no  known  relationship  between  an  adage  battle 
day  and  a  battle  day  in  the  scenario  being  used  to  generate  attrition 
data.  Even  though  some  advocates  of  the  adage  believe  that  complete
coverage  of  the  ground  war  is  not  necessary  to  study  the  relative  effec¬ 
tiveness  of  air  defense  weapons,  the  study  advisory  group  directed  that 
ground  war  attrition  be  a  part  of  the  adage  model.  Although  dissatisfac¬ 
tion  with  the  adage  ground  war  attrition  rates  had  been  expressed,  the 
acting  director  of  the  tradoc  studies  and  analysis  directorate  stated  that
there  was  nothing  basically  wrong  with  the  adage  model  and  all  that  it 
required  were  reasonable  inputs.  It  seems,  then,  that  if  the  ground  war 
scenario  problems  can  be  solved,  the  concerns  may  be  dispelled  about 
the  adage’s  portrayal  of  the  ground  war. 

Unlike  the  Carmonette  and  adage,  the  como  does  not  simulate  interac¬ 
tions  between  ground  forces  and,  thus,  does  not  measure  ground  battle 
damage  to  either  air  defense  weapons  or  any  other  ground  target.  To  the 
extent  that  air  defense  weapons  should  be  threatened  by  ground  attack, 
the  realism  of  the  como  modeling  approach  is  diminished.  However,  to 
the  extent  that  the  scenario  avoids  playing  the  forward  edge  of  the  bat¬ 
tle  area  or  establishes  a  scenario  in  which  ground  attack  is  not  a  fac¬ 
tor — such  as  air  base  attack — then  the  absence  of  the  portrayal  of 
ground  attack  is  not  critical. 


The  Interaction  of  Air  and
Ground  Wars

Models  of  air  defense  should  allow  the  air  and  ground  combat  to  interact
in  a  reasonable  manner.  The  Carmonette  treats  events  dynamically,  but
the  adage  allows  no  dynamic  interaction  between  losses  from  ground
fire  and  air  attack.  The  adage  calculated  all  ground  damage,  whether
caused  by  the  ground  war  or  air  attacks,  by  applying  attrition  rates  to
ground  assets.  The  assessment  of  losses  was  calculated  between  waves
of  air  raids  rather  than  during  them,  and  damage  depended  on  the  type
of  target,  among  other  things.

The  Campaign  submodel  of  the  adage  uses  externally  generated  attrition
rates  for  the  ground-to-ground  war.  It  uses  probability-of-destruction
input  from  the  munitions  effectiveness  subgroup  of  the  Joint  Technical
Coordinating  Group’s  survivability  program  to  calculate  air-to-ground
attrition.  The  ground-to-ground  and  air-to-ground  damage  calculations
are  separate  subroutines  and  do  not  interact.  The  portrayal  of  air-to-ground
damage  considers  such  factors  as  ground  target  class,  number  of
targets  in  that  class,  total  number  of  raids  in  an  air  wave  attack,  the
assignment  of  those  raids  to  targets,  ordnance  loadings,  the  probability
of  locating  assigned  targets,  and  probabilities  of  destruction  that,  combined
with  aircraft  probability-of-survival  factors,  produce  a  parameter
called  the  fraction  by  which  targetable  elements  are  to  be  reduced.  This
procedure  produces  average  damage  for  all  targets  in  a  class  rather  than
damage  to  specific  targets.

The  adage  documentation  indicates  that  this  procedure  may  lead  to
overestimating  air-to-ground  damage  in  certain  cases.  The  adage's
approach  to  the  air-to-ground  war  produced  results  that  its  critics  call
unexplainable,  unconvincing,  and  disconnected  from  reality  and  that 
resulted  in  an  attempt  to  require  that  they  be  made  consistent  with 
other  tradoc  studies.  This  consistency  was  to  be  considered  not  equiva¬ 
lence  but  reasonable  agreement  with  attrition  results  developed  in  other 
models.  The  study  advisory  group  suggested  that  consistency  might  be 
obtained  by  having  the  adage  use  an  air  threat  similar  to  that  used  in 
the  SCORES  V  scenario. 

In  this  connection,  even  the  adage’s  critics  say  that  the  model  is  useful
for  air  defense  weapon-system  comparisons  and  that  its  per-raid
attrition  did  not  differ  much  from  the  Carmonette’s  per-raid  attrition;  it 
was  the  accumulation  of  attrition  over  multiple  raids  that  caused  prob¬ 
lems.  The  results  being  questioned — especially  damage  by  fixed-wing 
aircraft — could  not  be  resolved  by  the  Carmonette’s  results  until  the 
Carmonette  played  fixed-wing  for  the  comparative  analysis  of  1985. 
Even  then,  battalion  rather  than  division  portrayal  raised  questions  of 
the  appropriateness  of  comparing  the  Carmonette’s  results  to  those  of
the  ADAGE.

Proponents  of  the  adage  assert  that  it  is  appropriate  for  comparing  air 
defense  weapon  systems  even  though  the  air-to-ground  damage  results 
may  be  “too  high.”  They  state  that  accurate  numbers  are  not  necessary 
when  comparing  the  relative  effects  of  different  systems.  Even  its  crit¬ 
ics  agree  that  the  adage  produced  similar  results — major  damage  by 
enemy  air — no  matter  how  many  excursions  were  run.  These  attrition 
rates,  which  were  considered  excessive,  cannot  be  overlooked,  but  the 
consistency  of  air  damage  to  ground  targets  using  different  weapon-sys¬ 
tem  combinations  in  the  adage  cannot  be  overlooked  either.  Further 
examination  of  the  aircraft  damage  to  ground  assets  appears  warranted. 

The  only  way  ground  assets  are  damaged  or  destroyed  in  the  como  is  by 
air  attack.  These  assets  in  como  modeling  are  often  air  defense  weapons, 
although  other  ground-based  assets  may  be  included.  Loss  of  ground 
targets,  like  all  attrition  in  the  como,  is  played  probabilistically.  The 
destruction  of  ground  assets  depends  on  successful  attack  by  and  sur¬ 
vival  of  particular  threat  aircraft. 

How  well  do  the  models  we  reviewed  address  the  critical  aspects  of  the 
combat  arena  in  which  the  weapon  system  is  to  be  used?  All  the  models 
have  weaknesses  in  the  portrayal  of  at  least  one  critical  aspect  of  the
air  defense  combat  arena.  The  adage  and  como  give  inadequate  consider¬ 
ation  to  the  effects  of  ground  war  activities  on  air  defense  weapons,  and 
they  do  not  completely  portray  the  interaction  of  air  and  ground  activities.
The  Carmonette’s  treatment  of  the  air  war  is  incomplete,  since  it
continually  failed  to  include  fixed-wing  aircraft  effects  and  only 
recently  addressed  these  aircraft,  even  indirectly.  The  strength  of  the 
adage  and  como  lies  in  the  portrayal  of  the  air  war,  while  the 
Carmonette’s  strength  is  its  good  portrayal  of  ground  activities. 


Mathematical  and  Logical
Representations


Another  critical  area  of  concern  in  modeling  the  operational  effective¬ 
ness  of  weapon  systems  is  how  the  theory  and  the  phenomena  are  math¬ 
ematically  and  logically  represented.  As  we  have  summarized  in  table 
4.8,  three  areas  of  concern  about  the  adage  are  the  expected-value 
approach  in  the  Campaign  for  modeling  engagements  of  multiple  air 
defense  weapons  against  multiplane  attacks,  its  use  of  the  probability  of 
participation  of  air  defense  weapons,  and  its  apparent  exaggeration  of 
the  divad’s  survivability. 

The  adage  does  not  account  for  the  spatial  or  temporal  saturation  of 
enemy  aircraft — that  is,  many  attacking  at  the  same  time.  Rather,  it 
uses  an  expected-value  approach,  in  which  the  probability  of  aircraft 
survival  in  a  many-on-many  raid  is  based  on  crossproducts  of  simple 
exponential  expansions  of  the  basic  one-on-one  survival  probabilities  of 
individual  air  defense  weapon  systems.  Some  authorities  believe  this 
approach  is  severely  flawed  because  its  results  are  simple  extrapola¬ 
tions  of  one-on-one  free-encounter  attrition  factors  and  ignore  the  total¬ 
ity  of  a  configured  many-on-many  encounter  with  its  many  potential 
interactions.  These  extrapolations  suppress  the  stochastic  or  probabilis¬ 
tic  effects  of  many-on-many  encounters,  because  to  treat  them  analyti¬ 
cally  in  an  expected-value  approach  is  unmanageably  complex.  Even  a 
small  engagement  of  10  weapons  versus  10  aircraft  requires  more  than 
10  million  analytical  steps. 
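The expected-value extrapolation described above can be sketched as a product of one-on-one survival probabilities, each raised to the number of weapons of that type encountered. This is an assumed reading of "crossproducts of simple exponential expansions" and not the adage's documented equation. The function name and the sample probabilities are hypothetical.

```python
def raid_survival_probability(one_on_one_survival, weapons_encountered):
    """Aircraft survival against a mixed defense, treating every encounter as an
    independent one-on-one engagement.

    one_on_one_survival maps weapon type to single-encounter survival probability.
    weapons_encountered maps weapon type to the (possibly fractional) expected
    number of weapons of that type met along the flight path.
    """
    p = 1.0
    for weapon, survival in one_on_one_survival.items():
        p *= survival ** weapons_encountered.get(weapon, 0.0)
    return p

p_survive = raid_survival_probability(
    {"divad": 0.9, "stinger": 0.95},   # illustrative values only
    {"divad": 2, "stinger": 1},
)
```

The calculation never looks at how many aircraft arrive together or which gun is busy with which target, so saturation and target switching cannot influence the result. That is the objection raised in the text.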

Therefore,  it  is  not  possible  to  relate  the  analytic  equations  to  the  spe¬ 
cific  parametric  performance  of  a  given  weapon  or  to  relate  that  per¬ 
formance  to  lower-level  decisions  and  engagement  rules.  The  analytical 
approach  relies  on  the  use  of  expected  values  to  represent  the  behavior 
of  random  processes,  and  many  of  the  possible  variations  thereby  lost 
are  adequate,  of  themselves,  to  materially  alter  the  course  of  the  battle 
and  destroy  the  relationships  and  effects  being  investigated.  A  complete 
Monte  Carlo  approach  to  modeling  is  generally  recommended. 
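A toy comparison can show why a Monte Carlo treatment may give very different answers from an extrapolated one-on-one result when a raid saturates the defense. The sketch rests on strong assumptions that come from neither model: each gun gets exactly one engagement per raid, picks its target at random, and has the same kill probability. All names and numbers are hypothetical.

```python
import random

def expected_value_survival(p_kill, guns):
    """Extrapolation: every aircraft is assumed to face every gun independently."""
    return (1.0 - p_kill) ** guns

def monte_carlo_survival(p_kill, guns, aircraft, trials=2000, seed=1):
    """Average fraction of aircraft surviving when each gun can engage only one
    randomly chosen aircraft per raid."""
    rng = random.Random(seed)
    survivors = 0
    for _ in range(trials):
        alive = [True] * aircraft
        for _ in range(guns):
            target = rng.randrange(aircraft)
            # A shot at an already-destroyed aircraft is wasted (overkill).
            if alive[target] and rng.random() < p_kill:
                alive[target] = False
        survivors += sum(alive)
    return survivors / (trials * aircraft)

ev = expected_value_survival(0.5, guns=4)
mc = monte_carlo_survival(0.5, guns=4, aircraft=10)
```

With 4 guns and 10 aircraft, at most 4 aircraft can be lost in any trial, so the simulated survival fraction cannot fall below 0.6, whereas the extrapolation gives 0.0625. The size of the gap is driven by the assumptions, but it illustrates the kind of interaction an expected-value formula cannot represent.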

Aggravating  this  basic  unsoundness  of  the  adage  is  the  process  used  to 
determine  the  number  of  air  defense  weapons  to  be  used  in  the  ingress-egress
portion  of  a  many-on-many  raid.  The  number  of  air  defense
weapons  encountered  by  enemy  aircraft  is  strongly  influenced  by 
another  parameter — the  probability  of  an  air  defense  weapon  participating
in  the  defense  against  enemy  aircraft.  Determining  the
probability  of  an  air  defense  weapon  participating  in  the  air  battle  starts 
with  several  assumptions:  (1)  the  gunner  has  survived,  (2)  the  system  is 
operational,  and  (3)  the  gunner  and  the  system  are  in  the  right  place  at 
the  right  time.  Since  the  adage  does  not  play  the  ground  war  dynamically
but  assesses  damage  to  air  defense  weapons  through  “bookkeeping”
routines  that  account  for  damage  at  the  end  of  a  wave  of  aircraft
raids,  all  weapons  available  at  the  beginning  of  a  raid  are  presumed  to 
be  available  throughout  that  specific  wave.  This  could  overstate  the 
total  number  of  weapons  available  within  a  wave. 

Once  these  assumptions  are  accepted,  however,  the  probability  that  any 
air  defense  weapon  will  participate  in  the  air  battle  becomes  a  function 
of  the  weapon  type,  the  zone  in  which  the  weapon  is  deployed,  the  type 
of  attacking  aircraft  being  engaged,  and  whether  the  raid  is  ingressing,
attacking  the  target,  or  egressing.  (“Zone”  refers  to  the  fact  that  the 
adage  partitions  the  division  into  four  zones  parallel  to  the  area  of  the 
forward  edge  of  battle.)  Some  of  the  factors  depend  on  tactics  and  doc¬ 
trine,  the  tactical  situation,  the  commander’s  guidance,  and  the  intensity 
of  the  ground  battle.  Specific  considerations  are  the  operational  availa¬ 
bility  of  the  gun,  the  suppression  that  may  have  taken  place,  the  move¬ 
ment  of  air  defense  weapons,  smoke  and  dust  conditions,  and  raid 
saturation.  A  systematic  mixing  of  all  these  considerations  results  in  a 
set  of  probabilities  of  participation  for  each  type  of  air  defense  weapon 
against  each  type  of  aircraft. 

These  probabilities  anticipate  likely  participation  by  the  divad  except 
against  ingressing  and  egressing  targets  2.5  to  5  kilometers  behind  the 
forward  edge  of  battle.  The  concept  of  probability  of  participation  was 
not  clearly  understood  in  the  simulation,  and  the  first  cost  and  opera¬ 
tional-effectiveness  analysis  on  the  divad,  which  was  based  on  the 
adage,  indicated  that  the  probabilities  of  participation  might  be  optimis¬ 
tic.  Although  one  of  the  reasons  cited  for  using  the  Carmonette  in  the 
1984  divad  update  analyses  was  to  shed  light  on  this  parameter,  we  were 
informed  that  this  subject  was  not  studied,  and  no  relevant  information 
was  discussed  in  the  update  report. 

Another  area  of  concern  relates  to  the  adage's  definition  of  target  sets, 
which  led  to  an  apparent  exaggeration  of  the  divad's  survivability.  The 
adage  does  not  model  direct  attacks  by  aircraft  on  the  divad  itself,  since 
it  does  not  model  duels.  Instead,  the  attrition  of  the  weapon  was  played
in  the  Campaign,  which  uses  expected-value  equations  to  calculate  the 
probability  of  damage  to  ground  targets  by  class  from  air  attacks  and 
assumes  a  random  selection  of  targets  within  one  target  class.  Similar 
procedures  were  used  to  assess  damage  to  divad  weapons  in  the  ground
war. 

This  approach  led  to  a  problem  in  which  the  divad  was  labeled  the
“immortal  divad.”  adage  results  implied  that  it  took  10  times  the  number
of  air-to-ground  missiles  indicated  by  the  Carmonette  to  kill  one  divad. 
Analysis  by  the  study  advisory  group  indicated  that  classifying  the 
divad  in  a  target  class  by  itself  caused  the  adage  model  to  shoot  all  the 
helicopter  missiles  for  the  class  at  the  one  divad;  hence,  the  problem  was 
one  of  target  overkill  rather  than  of  the  divad’s  survivability.  The  cor¬ 
rection  of  this  problem — reclassifying  the  divad  into  a  tank-mechanized 
vehicle  target  set — was  a  source  of  discomfort  to  the  study  advisory 
group,  because  this  implied  a  change  in  enemy  helicopter  firing  priority. 
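
The overkill effect can be illustrated with a simple calculation. The sketch below is not the Campaign's actual expected-value equations, which are not reproduced in this report; it assumes, purely for illustration, that every missile allotted to a target class is fired at a target chosen at random within that class and kills it with a fixed probability. The function names and all numbers are hypothetical.

```python
def expected_kills_in_class(n_targets: int, n_missiles: int, p_kill: float) -> float:
    """Expected number of distinct targets killed when each missile picks one
    target uniformly at random from the class (assumed firing rule)."""
    # A given target survives one missile with probability 1 - p_kill / n_targets.
    p_survive_all = (1.0 - p_kill / n_targets) ** n_missiles
    return n_targets * (1.0 - p_survive_all)


def missiles_per_kill(n_targets: int, n_missiles: int, p_kill: float) -> float:
    """Missiles expended for each expected kill in the class."""
    return n_missiles / expected_kills_in_class(n_targets, n_missiles, p_kill)


# Hypothetical allotment of 20 missiles with a 0.5 kill probability each.
print("class of 1 target  :", missiles_per_kill(1, 20, 0.5))
print("class of 10 targets:", missiles_per_kill(10, 20, 0.5))
```

Under these assumptions, a class containing a single target absorbs the entire allotment for at most one kill, so the missiles-per-kill figure is inflated by overkill rather than by any toughness of the target. Placing the same target in a larger class spreads the missiles and lowers the figure.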

The  Carmonette  also  had  problems  with  mathematical  and  logical  repre¬ 
sentations.  Even  though  its  proponents  asserted  that  the  mathematics  of 
the  model  was  rather  simple  and  straightforward,  early  attempts  to 
model  the  divad  included  at  least  one  basic  mathematical  error.  Early  in 
the  Carmonette's  use,  reviewers  from  the  U.S.  Army  Air  Defense  Artillery
School  discovered  that  the  Carmonette  routines  were  incorrectly
squaring  a  probability-of-kill  parameter  in  its  gun  submodel.  This  would 
obviously  distort  the  effectiveness  results  but  was  corrected  for  the 
Carmonette  analyses  of  the  divad. 
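
A short calculation shows why an error of this kind matters. Because a probability lies between 0 and 1, squaring it can only make it smaller. The sketch below assumes, hypothetically, that the parameter is a per-burst kill probability and that bursts are independent; the report does not describe how the gun submodel actually used the parameter, and the values are invented for illustration.

```python
def cumulative_kill_probability(p_single: float, n_bursts: int) -> float:
    """Probability of at least one kill in n independent bursts (assumed model)."""
    return 1.0 - (1.0 - p_single) ** n_bursts


p_kill = 0.3   # hypothetical per-burst kill probability
bursts = 3     # hypothetical number of bursts in an engagement

correct = cumulative_kill_probability(p_kill, bursts)
squared = cumulative_kill_probability(p_kill ** 2, bursts)  # the squaring error
print("with the intended parameter:", round(correct, 3))
print("with the squared parameter :", round(squared, 3))
```

Under these assumptions the squared parameter understates the weapon's effectiveness, and the shortfall carries through every engagement in a run.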

A  logical  consideration  involving  the  Carmonette’s  application  of  Monte 
Carlo  techniques  that  was  of  concern  to  the  same  reviewers  was  the  pro¬ 
cedure  used  to  generate  random  numbers  for  various  randomly  occur¬ 
ring  events  in  the  model.  The  Carmonette  generated  random  numbers 
only  once,  at  the  beginning  of  the  run;  it  used  the  same  random  numbers
throughout  the  run.  For  example,  the  degree  to  which  detection  sensors 
would  be  degraded  by  enemy  electronic  countermeasures  was  selected 
randomly  at  the  beginning  and  used  throughout  the  entire  run.  It  is  rea¬ 
sonable  to  assume,  even  for  the  short  battles  that  are  modeled  in  the 
Carmonette,  that  the  effects  of  electronic  countermeasures  would  vary 
and  that  a  better  representation  of  them  should  have  been  modeled. 
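
The difference between drawing a random value once and drawing it for each event can be sketched as follows. This is an illustration of the general point, not a reconstruction of the Carmonette's routines; the function names and the idea of a "degradation" value between 0 and 1 are assumptions made for the example.

```python
import random
from typing import List


def degradation_drawn_once(n_events: int, rng: random.Random) -> List[float]:
    """One draw at the start of the run, reused for every later event."""
    degradation = rng.random()
    return [degradation] * n_events


def degradation_drawn_per_event(n_events: int, rng: random.Random) -> List[float]:
    """A fresh draw for every event, so the effect can vary within the run."""
    return [rng.random() for _ in range(n_events)]


# Hypothetical run with five detection events and a fixed seed for repeatability.
print(degradation_drawn_once(5, random.Random(7)))
print(degradation_drawn_per_event(5, random.Random(7)))
```

With a single draw, one unlucky or lucky value governs the whole battle. With a draw per event, the degradation varies during the run, which is closer to the behavior the reviewers considered reasonable to expect.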

Finally,  the  Carmonette  did  a  reasonably  good  job  of  modeling  the 
dynamic  interactions  of  multiple  aircraft  against  multiple  air  defense 
weapons,  but  the  probabilities  of  killing  fixed-wing  aircraft  were  determined
with  a  procedure  basically  similar  to  that  used  in  the  adage.  The
model  primarily  addressed  one-on-one  engagements  in  a  few-on-few  con¬ 
text  and  used  the  same  approach  discussed  earlier  to  determine  the  kill 
probabilities  applicable  to  a  multiaircraft,  multiweapon  context.  This 
opens  the  Carmonette  to  some  of  the  same  criticisms  applicable  to  the 
adage  for  fixed-wing  aircraft. 

In  the  como,  weapons  are  unavailable  for  further  use  as  soon  as  they  are 
destroyed  by  aircraft  attack.  Similarly,  the  availability  of  weapons  to 
engage  target  aircraft  is  limited  to  the  actual  capacity  constraints  of 
communications  channels,  launchers,  radar,  and  so  on.  The  “bookkeeping”
capabilities  of  the  como  are  constantly  in  use  to  determine  the
resources  that  are  available  and  whether  the  operation  of  the  system  is 
possible.  If  threat  aircraft  did  not  come  within  range  of  a  Stinger  unit, 
the  unit  would  not  be  engaged,  regardless  of  how  many  threat  aircraft 
were  saturating  an  adjacent  area.  The  como  thus  avoids  the  pitfalls  of 
the  expected-value  approach.  In  return,  it  requires  realistic  scenarios, 
not  scenarios  that  have  been  specifically  developed  to  take  advantage  of 
the  model’s  limitations. 

How  appropriate  are  the  mathematical  and  logical  representations  used 
in  the  three  models?  The  expected-value  approach  of  the  adage  is 
severely  flawed  in  determining  the  effects  of  multiaircraft,  multiweapon 
engagements.  While  the  Carmonette’s  Monte  Carlo  approach  alleviates 
some  of  these  problems,  its  basic  mathematical  formulations  of  fixed- 
wing  aircraft  engagements  are  the  same  as  those  of  the  adage.  More¬ 
over,  both  of  these  models  have  other,  less  serious  mathematical  and 
logical  problems  that  threaten  the  credibility  of  the  results.  Only  the 
como  appears  to  be  free  of  serious  problems. 


The  Input  Sources 


We  have  noted  the  appropriateness  of  input  factors  throughout  the  dis¬ 
cussion.  In  assessments  of  the  credibility  of  simulations,  data  considera¬ 
tions  are  important,  since  even  the  best  theoretical  model  produces 
noncredible  results  if  it  is  based  on  faulty  input  data.  Since  the  whole 
simulation  can  falter  when  input  data  are  not  clearly  relevant,  complete 
information  about  the  data  is  necessary.  In  table  4.9,  we  have  summa¬ 
rized  the  more  critical  aspects  of  input  data  for  the  three  simulations. 


Data  Sources

All  the  models  used  data  developed  by  recognized  sources.  The  adage
documentation  cited  the  tactical  air  division  of  the  office  of  secretary  of
Defense  for  planning  and  evaluation  as  its  primary  data  source.  Damage
to  ground  targets  by  enemy  aircraft — even  though  a  source  of  criticism 
for  producing  unconvincing  results — was  based  on  the  data  and  method¬ 
ology  from  the  joint  munitions  effectiveness  manual  of  the  U.S.  Army 
Material  Systems  Analysis  Activity,  which  also  supplied  weapon-system 
characteristics,  as  did  the  weapon-systems  project  managers  and  the 
Army  Material  Development  and  Readiness  Command.  Ground  battle 
data  came  from  the  Combined  Arms  Combat  Development  Activity; 
visual  detection  data  came  from  the  U.S.  Army  Missile  Command  and 
the  Night  Vision  and  Electro  Optics  Laboratory  in  Fort  Belvoir,  Virginia. 
Some  of  the  data  came  from  the  Army  Air  Defense  Artillery  School. 

All  these  are  typical  data  sources  for  dod  simulations.  Even  when  the 
study  advisory  group  expressed  concern  about  input  values  from  these 
sources,  it  had  difficulty  recommending  more  appropriate  sources.  With 
respect  to  visual  detection,  however,  it  should  be  noted  that  the  labora¬ 
tory’s  sources  were  used  in  the  adage  to  detect  pop-up  helicopters,  prin¬ 
cipally  because  the  Carmonette’s  modelers  would  not  accept  extra¬ 
polations  of  the  VISPOE  results  from  the  missile  command,  even  though 
they  adopted  its  methodology  for  visual  detection  in  their  modeling  of 
fixed-wing  aircraft,  since  the  laboratory’s  method  applied  only  to 
helicopters. 

The  input  data  sources  for  the  Carmonette  included  the  Defense  Map¬ 
ping  Agency  for  terrain  data,  the  Atmospheric  Science  Laboratory  for 
smoke  and  dust  considerations,  the  Night  Vision  and  Electro  Optics  Lab¬ 
oratory  and  the  Army  Missile  Command  for  visual  detection,  and  the 
Army  Material  Systems  Analysis  Activity  for  weapons  characteristics 
and  lethality  data.  The  waterways  experimentation  station  and  the 
Tank  and  Automotive  Command  were  the  source  of  ground  vehicle 
mobility  information.  These  appear  to  have  been  appropriate  data 
sources.  The  Carmonette  depends  on  input  from  the  users  of  the  model 
for  a  description  of  the  processes  to  be  simulated,  and  since  weapon  sys¬ 
tems,  terrain,  and  time  are  explicitly  modeled,  there  is  no  inherent  limi¬ 
tation  on  what  it  can  simulate. 

The  como  uses  some  of  the  sources  that  the  adage  and  Carmonette  use 
and  some  that  are  different.  Like  the  adage,  it  receives  detection  data 
from  the  Army  Missile  Command.  Like  the  Carmonette,  it  uses  terrain
data  from  the  Defense  Mapping  Agency.  Like  both  the  adage  and 
Carmonette,  it  uses  lethality  data  provided  by  the  Army  Material  Sys¬ 
tems  Analysis  Activity.  Scenario  information  comes  from  the  tradoc 
Systems  Analysis  Activity  and  the  Army  Air  Defense  Artillery  School. 
Additional  scenario  data  from  the  Concepts  Analysis  Agency  were  used.
Weapon-system  characteristics  were  provided  by  the  Army  Missile  Com¬ 
mand  for  friendly  weapons  and  the  Intelligence  Security  Command  for 
enemy  weapons.  The  como's  data  sources  appear  to  have  been 
appropriate. 

The  simulations  shared  some  data  sources,  especially  for  lethality  data, 
but  each  model  also  had  unique  data  sources.  Since  some  of  the  sources 
did  differ,  there  is  always  the  possibility  of  differing  qualities  of  data 
inputs  across  the  models.  One  such  area  was  visual  detection,  for  which 
the  Army  has  not  yet  resolved  disputes  concerning  the  data. 


Data  Quality

The  appropriateness  and  structure  of  data  for  use  in  a  particular  simulation
can  be  a  source  of  concern.  If  the  data  are  basically  inappropriate
or  problems  arise  from  structuring  the  data  for  use  in  a  simulation,  the 
simulation’s  results  may  not  be  well  accepted. 

The  adage  produced  results  that  were  unconvincing  to  some  potential
users,  and  some  data  items,  such  as  target  damage  tables,  were  thought
by  the  adage  modelers  to  yield  overestimates  of  damage  in  some  cases. 
The  terrain  data  were  recognized  as  old,  and  ground-war  attrition  rates 
concerned  the  study  advisory  group,  which  also  considered  target  mili¬ 
tary-worth  data — used  in  the  adage  to  measure  the  worth  of  unlike 
targets  such  as  tanks  and  air  defense  weapons  and,  therefore,  directly 
related  to  the  adage’s  measures  of  effectiveness — to  be  consistent  with 
similar  data  used  in  other  models.  All  these  elements  taken  together 
probably  led  to  the  conclusion  of  one  tradoc  official  that  there  was 
nothing  wrong  with  the  adage  program — all  it  needed  was  reasonable 
inputs.  In  defense  of  the  adage,  its  proponents  asserted  that  even 
though  some  data  elements  that  related  air  damage  to  ground  targets 
might  be  too  high,  they  were  all  right  for  the  adage's  purpose,  which  was
to  compare  competing  weapon  systems.  Correct  relative  values  are  suffi¬ 
cient  for  this,  and  correct  absolute  values  are  not  necessary. 

Visual  detection  ranges  were  a  source  of  serious  disagreement  between 
the  adage  and  Carmonette  modelers.  The  compromise,  which  was  to  use 
results  from  modeling  the  divad  with  forward-looking  infrared  sensors 
for  long-range  searches,  resulted  in  the  use  of  inaccurate  data  to  accom¬ 
modate  a  correct  theory  (coverage  of  the  gun's  full  range)  and  points  out 
the  need  to  establish  data  sources  that  will  measure  visual  detection 
over  the  full  range  of  a  weapon  without  including  in  the  input  data
weapon  characteristics  that  do  not  exist.

One  aspect  of  the  adage  data-handling  requires  special  attention:  how
the  adage  models  weapon  characteristics  in  its  Incursion  component. 
Because  of  the  complexity  and  uniqueness  of  weapon-systems  input 
data,  the  adage  modelers  wrote  the  weapon-system  characteristics  into 
the  model  rather  than  addressing  them  through  an  external  data  base. 
This  is  contrary  to  requirements  suggested  by  the  Joint  Forward  Area 
Air  Defense  Test  Force  and  was  considered  a  weakness  by  the  test  force 
reviewers.  In  addition,  since  many  of  the  data  elements  were  classified, 
these  reviewers  were  not  able  to  review  the  documentation  for  the 
Incursion  component  that  contained  the  computer  program.  While 
changes  to  the  program  could  be  made,  the  adage  modelers  required  new 
data  of  appropriate  format  for  the  model.  Changes  to  the  computer  code 
of  a  model,  even  though  supposedly  limited  to  data  elements,  always 
carry  the  risk  of  uncontemplated  changes  to  the  program  itself.  This  is  a 
legitimate  concern  that  nevertheless  seems  secondary  compared  to  the 
problems  with  the  basic  data  values  themselves. 

Overall,  the  adage  input  values  are  a  source  of  concern  clouding  its  gen¬ 
eral  acceptability.  At  the  same  time,  however,  the  adage’s  basic 
approach — comparing  different  weapon  systems  competing  in  the  same 
functional  area — should  be  carefully  considered  before  unnecessarily 
stringent  data  requirements  are  imposed. 

The  Carmonette  depends  on  the  user’s  input  for  a  description  of  the  pro¬ 
cess  to  be  simulated.  Since  weapon  systems,  terrain,  and  time  are  explic¬ 
itly  modeled,  there  is  no  inherent  limitation  on  what  the  Carmonette  can 
simulate.  Its  input  structure  allows  considerable  flexibility  but  also 
places  the  burden  of  obtaining  realistic  simulations  on  the  analyst  and 
requires  extensive  effort  in  data  preparation.  We  have  already  dis¬ 
cussed  several  significant  data  problems:  the  Carmonette’s  early  use  of 
the  ZSU-23  data  characteristics  to  model  the  divad  was  corrected,  but  the
failure  to  properly  represent  the  divad’s  visual  detection  capabilities 
resulted  in  the  use  of  incorrect  data  to  model  the  divad's  full  range. 

Other  problems  relate  to  the  Carmonette’s  input  structure.  Input  data
have  to  be  tailored  to  meet  the  model’s  logic  and  to  make  the  results 
plausible,  yet  tailoring  opens  the  possibility  that  the  end  results  will 
depend  as  much  on  the  judgment  of  the  analyst  as  on  the  manipulations 
in  the  model.  Changes  that  seem  insignificant  can  produce  a  widespread 
effect.  One  can  speculate  that  the  difficulty  in  tracing  the  reason  for  the 
divergence  in  adage  and  Carmonette  results  (discussed  in  chapter  5) 
might  be  related  to  this  tailoring.  Tailoring  data  is  time-consuming.  A 
principal  reason  for  not  including  fixed-wing  aircraft  in  the  Carmonette 
for  the  cost  and  effectiveness  analysis  update  was  insufficient  time  to
do  so  and  still  meet  study  constraints. 

Many  of  the  Carmonette’s  early  data  problems  were  resolved,  but  it
shares  one  problem  with  the  adage  that  still  needs  resolution:  how  to
handle  visual  detection.  Moreover,  the  Carmonette’s  data-handling 
requirements  regarding  both  time  and  tailoring  can  and  did  limit  its 
usefulness. 

The  COMO  uses  program  modules  that  describe  the  characteristics  and 
operations  of  specific  weapon  systems  at  varying  levels  of  detail, 
depending  upon  the  intended  application.  To  produce  the  Stinger
battery-coolant-unit  usage  study,  it  was  necessary  to  increase  the  detail
over  that  of  the  standard  Stinger  model  by  making  program  changes  to 
the  initial  Stinger  model.  While  Stinger  engineering  data  are  straightfor¬ 
ward  and  reasonably  reliable,  human-factors  data  for  Stinger  personnel 
reactions  and  functions  (such  as  detection  and  engagement  processes) 
are  less  well  understood. 

Like  the  Carmonette,  the  COMO  requires  some  tailoring  of  the  input  data, 
which  must  be  evaluated  to  determine  the  appropriate  factors  to 
include.  Thus,  as  with  the  Carmonette,  the  data-tailoring  may  be  important
to  the  model’s  results.

In  summary,  there  were  problems  with  obtaining  appropriate  input 
information  for  the  adage  and  Carmonette.  Some  of  these  problems  were 
corrected  and  some  were  not.  The  Carmonette  and  como  required  data- 
tailoring,  which  raised  the  question  about  whether  the  results  depended 
as  much  on  the  data-tailoring  as  on  the  models’  manipulations  of  the 
data.  All  the  models  used  recognized  data  sources,  although  not  necessa¬ 
rily  the  same  sources.  A  result  of  differing  sources  could  be  differing 
quality  of  data  inputs.  Data  problems  did  occur,  at  least  in  the  adage 
and  Carmonette,  and  some  of  these  problems  were  related  to  challenges 
of  the  results.  Data-structuring  presented  problems  to  both  the  adage 
and  Carmonette.  The  Carmonette’s  extensive  structuring  requirements 
prevented  a  timely  inclusion  of  fixed-wing  aircraft.  More  attention  to  a 
model’s  data  requirements  should  improve  its  usefulness  and  credibility. 


Summary 


Our  review  of  the  adage  and  Carmonette  models  of  the  divad  and  the 
como  III  model  of  the  Stinger  led  us  to  these  conclusions: 

The  Carmonette  has  sound  theory  for  a  combined-arms  analysis,  but  its
approach  is  not  the  most  appropriate  for  decisions  regarding  competing 
air  defense  weapons.  The  adage  and  como  III  were  designed  with  such
decisions  in  mind. 

All  three  models  have  specific  strengths  in  dealing  with  the  critical 
aspects  of  air  defense  weapons  but  all  also  have  serious  weaknesses. 

All  three  models  are  in  some  respect  restricted  and  incomplete  in  their 
coverage  of  the  combat  arena. 

Of  the  three  models,  the  adage  has  the  greatest  number  of  basic  mathematical
and  logical  flaws  that  raise  concerns  about  the  credibility  of  its
results. 

All  three  models  address  operational  measures  of  effectiveness,  but  the 
adage  appears  to  relate  its  measures  more  closely  to  protection,  the  ulti¬ 
mate  mission  of  air  defense,  while  the  other  models  stress  loss-exchange 
ratios. 

The  adage  and  Carmonette  simulations  of  the  divad  both  had  problems 
with  obtaining  appropriate  data,  and  these  problems  affected  the  credi¬ 
bility  of  the  simulation  results;  the  Carmonette  and  como  III  require 
extensive  tailoring  of  the  data,  and  the  effects  of  this  cannot  be  easily 
distinguished  from  manipulations  of  the  models. 

All  the  models  we  reviewed  had  advantages  that  made  them  applicable 
for  answering  certain  issues  and  disadvantages  that  detracted  from 
their  usefulness.  We  recognize  that  it  is  practically  impossible  for  a  sim¬ 
ulation  to  fully  address  all  aspects  of  an  issue.  The  question  becomes,  Is 
the  simulation  sufficiently  applicable  to  address  the  critical  aspects  of 
the  issue? 

The  basic  theoretical  approach  of  models  is  a  key  consideration.  The 
adage  and  como  are  functional  models,  designed  to  compare  specific 
types  of  air  defense  weapons,  whereas  the  Carmonette  is  a  combined- 
arms  model,  focusing  primarily  on  alternative  strategies  in  ground  war. 
From  this  perspective,  the  adage  and  como  are  perhaps  more  appropriate
than  the  Carmonette  for  their  purpose:  deciding  between  air
defense  weapons  in  a  given  scenario.  Even  critics  of  the  adage  agree  to
its  usefulness  for  this  purpose.

None  of  the  models  fully  address  the  tactical  environment  of  the  weapon
system  studied;  nevertheless,  each  has  definite  strengths.  The  adage's
strengths  lay  in  its  portrayal  of  the  divad  gun  in  its  intended  environment
and  in  its  coverage  of  helicopter  and  fixed-wing  targets  and  their
ability  to  inflict  serious  damage.  The  Carmonette’s  strengths  were  its
portrayal  of  the  ground  battle  and  its  dynamic  interactions.  Like  the
adage,  the  como  portrays  an  appropriate  air  defense  environment  with
its  essentially  unlimited  battle  size. 

The  Carmonette’s  weaknesses  prevent  the  model  from  completely  simu¬ 
lating  air  defense,  since  its  scope  is  too  small  and,  until  recently,  it  failed 
to  address  fixed-wing  aircraft,  a  principal  target  set.  The  como’s  failure 
to  represent  the  movement  of  the  air  defense  weapon  causes  the  model 
to  overlook  portability,  a  principal  characteristic  of  the  Stinger,  while  its 
short  timespan  limits  its  usefulness  for  studying  extended  warfare.  The 
adage’s  approach  to  terrain  detracts  from  the  realism  of  its  modeling 
while  improving  the  ability  to  generalize  from  it.  In  summary,  the  adage 
and  como  address  ground-to-air  activities  reasonably  well,  while  the 
Carmonette’s  strength  lies  in  its  treatment  of  ground  activities. 

Both  the  adage  and  Carmonette  modeled  how  the  divad  gun  detected 
enemy  aircraft  and  included  provisions  for  both  radar  and  visual  detec¬ 
tion.  They  addressed  visual  detection  differently  because  of  differences 
in  theory  and  input  data.  Neither  model  has  yet  appropriately  modeled 
the  divad’s  visual  detection  characteristics,  since  disagreement  over  the 
visual  detection  components  of  the  models  has  not  yet  been  resolved, 
leaving  unanswered  questions  as  to  whether  any  of  the  divad  studies 
have  appropriately  modeled  the  visual  detection  of  enemy  aircraft. 

The  como  suffers  from  some  of  the  same  shortcomings  as  the  adage  and 
Carmonette.  Like  the  Carmonette,  its  coverage  of  detection  throughout 
the  full  range  of  the  weapon  is  questionable,  and  like  the  adage,  it  lacks 
realistic  coverage  of  battlefield  obscurants.  The  Carmonette  tends  to 
give  more  complete  coverage  to  radar  phenomena  than  the  adage  but 
only  after  significant  model  changes.  Radar  was  not  applicable  to  the 
Stinger  in  the  como. 

While  the  adage  and  Carmonette  address  the  same  basic  phenomena  in 
modeling  an  engagement  between  the  divad  and  an  approaching  aircraft, 
differences  could  affect  the  acceptability  of  some  of  the  results — some 
favoring  the  adage  and  some  the  Carmonette.  All  things  considered,  the 
Carmonette  probably  models  the  engagement  of  enemy  aircraft  better 
than  the  adage,  since  it  models  more  phenomena  directly  and  uses  Monte 
Carlo  throughout.  Nevertheless,  the  Carmonette  suffers  from  a  more 
basic  problem;  it  does  not  model  the  divad  in  its  intended  environment. 

We  found  differing  emphases  on  the  aspects  of  the  air  and  ground  wars 
that  were  modeled  and  how  they  interacted.  All  the  models  failed  to 
address  certain  aspects  of  modern  warfare  and  addressed  other  aspects
inadequately,  limiting  the  insights  to  be  gained  about  the  effectiveness
of  new  weapons  in  battles  of  the  future.  Throughout  much  of  the  model¬ 
ing  effort  on  the  divad,  the  Carmonette  gave  inadequate  coverage  to 
fixed-wing  aircraft.  The  adage’s  expected-value  treatment  of  the  ground 
war  raised  concerns  about  its  credibility. 

The  adage  covered  nearly  all  aspects  of  the  air  war,  including  the  dam¬ 
age  to  division  ground  assets  by  enemy  fixed-wing  aircraft.  The 
Carmonette  analysts  did  not  include  the  effects  of  fixed-wing  aircraft  in 
the  Carmonette,  ignoring  them  completely  in  early  studies  and  relying  on
data  from  other  models  in  later  studies. 

The  Carmonette  was  designed  almost  30  years  ago  to  simulate  small-unit 
ground  combat  and  addresses  nearly  all  aspects  of  combined-arms 
ground  warfare.  The  adage  does  not  play  the  ground  war  directly  but 
relies,  instead,  on  externally  generated  attrition  rates.  The  adage’s 
approach  to  modeling  the  ground  war  attrition  input  data  and  its  estimates
of  air-to-ground  damage  are  principal  areas  of  disagreement  for
its  critics.  The  como  does  not  play  ground  war  at  all. 

The  preservation  of  ground  assets  is  the  primary  function  of  air  defense. 
The  adage  addressed  this  in  its  analyses,  but  the  study  advisory  group 
appeared  to  be  reluctant  to  consider  requiring  this  measure  in  the 
Carmonette  analyses.  Consequently,  the  Carmonette  results  concentrate 
on  various  exchange  ratios  that  are  principally  attrition  oriented.  Since 
the  como  did  not  play  the  ground  war  at  all,  its  ability  to  address  the
protection  of  forward-area  assets  was  limited.  Therefore,  of  the  three 
models  we  reviewed,  only  the  adage  addressed  air  defense  weapons 
in  their  primary  roles.  However,  even  the  adage  failed  to  address 
one  important  aspect  of  air  defense  that  was  addressed  by  the 
Carmonette — the  ability  of  air  defense  weapons  to  cause  aircraft  to 
abort  their  missions. 

The  adage  fails  to  address  explicitly  the  time  and  spatial  relationships 
of  a  many-on-many  raid,  relying  rather  on  expected-value  calculations.
How  much  this  theoretical  and  mathematical  problem  detracts  from  the 
results  is  difficult  to  determine  because  of  the  concurrent  problems  asso¬ 
ciated  with  the  input  data.  While  we  found  some  fundamental  errors  in 
the  theoretical  approach  to  modeling  air  defense,  many  of  the  problems 
we  noted  appeared  to  deal  with  the  appropriateness  of  data  inputs. 
Sometimes  the  problems  with  the  characterization  of  a  phenomenon  and 
its  environment  stemmed  from  using  inaccurate  data  to  achieve  a  cor¬ 
rect  theoretical  approach. 

All  the  models  took  approaches  to  data  treatment  that  were  unique  in
some  respects  and  each  had  peculiarities  worthy  of  note.  While  all  the 
models  obtained  their  data  from  recognized  sources,  how  they  used  the 
data  tended  to  differ.  If  the  wide  divergence  in  results  can  be  explained 
and  corrected  for,  then  the  adage  would  appear  to  be  able  to  give  the 
most  complete  treatment  of  air  defense  weapons.  Even  adding  fixed- 
wing  elements  to  the  Carmonette,  it  may  remain  less  appropriate  for  air 
defense  issues  because  of  the  level  of  battle  portrayed.  Questions  about 
the  adage’s  portrayal  of  the  ground  war  and  the  como’s  limited  modeling 
of  the  ground  war  detract  from  their  ability  to  measure  air  defense 
protection  of  ground  assets  in  a  combined-arms  environment.  The 
Carmonette  appears  capable  in  this  area,  but  protection,  air  defense’s 
primary  mission,  was  never  stressed  as  a  measure  of  effectiveness  in  the 
Carmonette  analyses. 

Attempts  continue  to  be  made  to  solve  the  problems  associated  with 
aspects  of  the  theory,  model  design,  and  input  data  in  the  Carmonette 
and  adage.  We  believe  that  as  these  efforts  continue,  both  models  may 
become  more  appropriate  for  analyses  of  the  effectiveness  of  air  defense 
weapon  systems.  As  some  of  the  problems  are  resolved,  the  results  may 
become  more  comparable — that  is,  if  the  principal  source  of  difference 
in  results  does  not  prove  to  be  the  size  of  the  battle  being  modeled.  The 
como,  however,  cannot  be  as  comprehensive  an  analysis  device  until  it 
too  addresses  the  effects  of  ground  war  activity. 


Verification


Determining  that  a  computer  program  performs  as  the  simulation  ana¬ 
lyst  intended  occurs  during  the  development  of  the  simulation.  Verifica¬ 
tion  efforts  should  also  occur  whenever  substantial  changes  are  made  to 
the  simulation.  Even  before  verification  begins,  some  of  its  components 
will  have  been  defined  by  the  selection  of  the  computer  simulation  lan¬ 
guage.  Programming  conventions  and  policies  such  as  structured  pro¬ 
gramming  further  define  the  context  for  verification. 

A  number  of  techniques  have  been  developed  or  adapted  to  assist  in 
verification.  Techniques  include  a  “structured  walk-through,”  a  line-by-line 
code  review  performed  by  several  members  of  the  modeling  team; 
program  traces,  listing  the  values  of  key  data  elements  after  each  event 
during  operation;  computer  runs  made  under  an  extremely  simplified 
scenario;  graphic  displays  of  simulation  output;  and  the  intentional 
insertion  of  errors  (or  “seeding”)  prior  to  line-by-line  review  to  develop 
estimates  of  remaining  errors.  (We  have  summarized  evidence  of 
verification  for  our  three  case  studies  in  table  5.2.) 
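
The seeding technique mentioned above can be illustrated with a minimal sketch. It assumes that reviewers are as likely to find a seeded error as a real one, so the share of seeded errors recovered stands in for the share of real errors recovered. The function name and the numbers are hypothetical and are not drawn from any of the case studies.

```python
def estimate_remaining_errors(seeded_total, seeded_found, real_found):
    """Estimate real errors still in the code after a seeded review.

    Assumption: a seeded error is as easy to find as a real one, so
    seeded_found / seeded_total estimates the fraction of real errors found.
    """
    if seeded_found <= 0 or seeded_found > seeded_total:
        raise ValueError("seeded_found must be between 1 and seeded_total")
    # Estimated total real errors = real_found / (seeded_found / seeded_total)
    estimated_total = real_found * seeded_total / seeded_found
    return estimated_total - real_found


# Hypothetical review: 20 errors seeded, 16 of them found, 12 real errors found.
remaining = estimate_remaining_errors(seeded_total=20, seeded_found=16, real_found=12)
print(remaining)  # 3.0
```

The estimate is only as good as the assumption that seeded errors resemble real ones; if the seeded errors are easier to spot, the remaining-error count will be understated.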

We  were  informed  that  no  formal  verification  effort  had  been  conducted 
for  the  adage  but  that  some  line-by-line  checks  of  computer  codes  to 
develop  an  understanding  of  the  model  had  uncovered  some  problems 
that  were  corrected.  The  Carmonette,  originating  in  the  1950’s,  has 
undergone  many  changes  since  then  and  is  still  being  changed.  We  found 
no  evidence  of  verification  efforts  but  were  informed  that  the  model  has 
been  subjected  to  extensive  peer  reviews.  We  were  unable  to  document 
verification  efforts  related  to  either  the  standard  Stinger  model  or  the 
version  that  was  developed  for  the  battery-coolant-usage  analysis.  It 
was  developed  by  a  contractor,  and  the  Army  Missile  Command 
informed  us  that  the  command  performs  verification  and  validation 
tests  for  model  acceptance.  We  had  no  data  on  those  tests. 

Identifying  verification  efforts  appears  to  be  one  of  the  more  difficult 
issues  of  our  framework.  We  did  not  identify  documented  verification 
efforts  specifically  related  to  the  divad  or  the  Stinger.  Our  discussion 
and  review  of  a  number  of  simulations  lead  us  to  believe  that,  in  general, 
there  is  no  audit  trail  to  identify  verification  efforts.  Verification  is  an 
integral  part  of  programming,  but,  like  programming,  it  is  often  not 
documented. 
Statistical  Representation 


Experts  in  simulation  have  noted  that  in  many  simulation  studies,  the 
greatest  time  and  money  are  spent  on  design,  development,  and  pro¬ 
gramming  and  that  relatively  little  effort  is  given  to  analyzing  a  simula¬ 
tion’s  output  data.  Since  Monte  Carlo  models  produce  results  by 
sampling  variables  represented  by  probability  distributions,  sufficient 
numbers  of  replications  and  the  appropriate  statistical  analysis  of  the 
simulation  results  are  necessary  to  allow  reasonable  confidence  that  the 
simulation  results  are  representative  of  the  model’s  true  values.  The 
objective  of  the  analysis  is  essentially  to  develop  estimates  of  both  the 
expected  value  of  outcomes  and  their  variance.  In  practice,  it  appears 
that  the  larger,  longer-running  simulations  are  less  likely  to  be  subjected 
to  this  analysis  because  of  the  major  demands  that  they  make  on  com¬ 
puter  time.  In  fact,  this  behavior  has  received  some  theoretical  support. 
As  early  as  1965,  Brooks  argued  that  only  a  few  replications  of  a  large 
battle  model  are  needed  to  get  good  estimates  of  the  gross  results 
(emphasis  ours),  provided  that  the  fate  of  a  given  weapon  has  strong 
influence  on  the  fates  of  only  a  limited  number  of  other  weapons 
(Brooks,  1965).  Many  analysts,  however,  believe  that  multiple  replica¬ 
tions  are  especially  needed  when  detailed  results  are  examined.  Some 
analysts  have  also  given  attention  to  developing  statistical  procedures 
that  will  reduce  the  required  number  of  replications.  (We  have  summa¬ 
rized  the  evidence  of  statistical  representation  in  our  three  case  studies 
in  table  5.3.) 
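
As a rough illustration of the analysis described above, the following sketch estimates the expected value and variance of an outcome from a set of replications and attaches a confidence half-width to the mean. The toy replication function is a hypothetical stand-in for a Monte Carlo run, not any of the models reviewed here, and the half-width uses a normal approximation, which understates the interval when only a few replications are available.

```python
import random
import statistics
from statistics import NormalDist


def summarize_replications(outcomes, confidence=0.95):
    """Return (mean, sample variance, confidence half-width) for replications."""
    n = len(outcomes)
    if n < 2:
        raise ValueError("at least two replications are needed to estimate variance")
    mean = statistics.mean(outcomes)
    variance = statistics.variance(outcomes)  # sample variance, n - 1 divisor
    # Normal approximation (assumption); a Student-t quantile would be wider for small n.
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    half_width = z * (variance / n) ** 0.5
    return mean, variance, half_width


def toy_replication(rng):
    """Hypothetical stand-in for one run: kills out of 20 engagements at p = 0.3."""
    return sum(1 for _ in range(20) if rng.random() < 0.3)


rng = random.Random(1)
outcomes = [toy_replication(rng) for _ in range(200)]
mean, variance, half_width = summarize_replications(outcomes)
print(f"mean={mean:.2f} variance={variance:.2f} half-width={half_width:.2f}")
```

Reporting the half-width alongside the mean is what allows a reader to judge whether the number of replications was sufficient.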

We  were  able  to  determine  that  substantial  attention  was  given  to  identi¬ 
fying  the  true  model  mean  for  the  Incursion  submodel  of  the  adage.  The 
only  portion  of  the  adage  model  that  is  Monte  Carlo  is  the  Incursion  sub¬ 
model,  which  produces  the  one-on-one  probabilities  of  kill  that  are  sub¬ 
sequently  used  for  each  weapon  system  modeled  in  the  Campaign 
submodel.  The  original  cost  and  operational-effectiveness  analysis  of  the 
divad  stated  that  the  Incursion  had  undergone  “a  sufficiently  large 
number  of  trials”  before  the  probability  of  kill  of  an  average  engage¬ 
ment  was  calculated.  Analysts  involved  with  the  adage  informed  us  that 
each  Incursion  scenario  was  replicated  500  times  in  producing 
probability-of-kill  results.  They  noted  that  these  replications  yielded  a 
98-percent  level  of  confidence  that  the  Incursion  results  were  within  1 
to  2  percent  of  the  true  mean,  although  the  specifics  are  not  presented  in 
their  reports.  The  adage  analysts  further  stressed  that  this  practical 
ability  to  generate  a  large  sample  size  is  an  advantage  that  the  adage 
has  over  the  Carmonette. 

Because  the  Carmonette  requires  substantial  computer  time,  only  a  lim¬ 
ited  number  of  replications  are  available  to  establish  confidence  that 
results  have  stabilized.  In  both  the  1984  divad  update  and  the  1985 
comparative  analysis,  the  minimum  number  of  replications  required  was  10. 
After  10,  the  analysts  conducted  statistical  analyses  to  determine 
whether  the  results  had  stabilized.  The  criterion  for  determining  stabili¬ 
zation  was  an  85-percent  level  of  confidence  that  the  results  were  within 
10  percent  of  the  true  mean.  The  principal  measures  to  which  this  crite¬ 
rion  was  applied  were  total  enemy  losses  and  total  friendly  losses.  In  the 
1984  update,  21  of  26  scenarios  tested  met  the  confidence  criterion 
within  the  minimum  10  replications.  The  largest  number  of  replications 
needed  was  17.  In  the  comparative  analysis,  the  analysts  determined 
that  all  29  scenarios  tested  met  the  criterion  within  the  original  10  repli¬ 
cations.  For  2  scenarios  in  the  update  and  for  one  in  the  comparative 
analysis,  however,  the  Carmonette  analysts  accepted  scenarios  as  stabi¬ 
lized  that  only  approached,  but  did  not  meet,  the  10-percent  precision 
factor. 
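
The stabilization rule described above (a minimum of 10 replications, then a check for 85-percent confidence that the result is within 10 percent of the mean) can be sketched as follows. This is not the Carmonette analysts' actual procedure, whose details are not given in the reports: the normal approximation, the cap on replications, and all function names are assumptions made for illustration.

```python
import random
import statistics
from statistics import NormalDist


def has_stabilized(outcomes, confidence=0.85, precision=0.10):
    """True if the confidence half-width is within `precision` of the sample mean."""
    n = len(outcomes)
    mean = statistics.mean(outcomes)
    if n < 2 or mean == 0:
        return False
    # Normal approximation (assumption); the reports do not state the test used.
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    half_width = z * statistics.stdev(outcomes) / n ** 0.5
    return half_width <= precision * abs(mean)


def replicate_until_stable(run_once, min_reps=10, max_reps=50):
    """Run `min_reps` replications, then add one at a time until stable or capped."""
    outcomes = [run_once() for _ in range(min_reps)]
    while not has_stabilized(outcomes) and len(outcomes) < max_reps:
        outcomes.append(run_once())
    return outcomes, has_stabilized(outcomes)


# Hypothetical outcome measure (for example, total losses) with moderate spread.
rng = random.Random(7)
outcomes, stable = replicate_until_stable(lambda: rng.gauss(100.0, 20.0))
print(len(outcomes), stable)
```

Returning the stability flag together with the outcomes makes it explicit when a scenario was accepted only because the replication cap was reached.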

The  Carmonette  analysts  elected  to  measure  stabilization  on  total  enemy 
and  total  friendly  losses — rather  than  enemy  aircraft  killed  by  divad  and 
vice  versa — because  of  the  small  number  of  guns  and  targets  available 
in  the  battalion  scenario.  Nevertheless,  it  is  noteworthy  that  these  out¬ 
put  measures  were  not  nearly  as  stable  as  total  losses,  several  scenarios 
showing  standard  deviations  as  large  as  or  larger  than  the  mean. 

The  Carmonette  analysts  justified  the  decision  not  to  run  additional 
replications  to  stabilize  these  variables  by  stating  that  since  the  mean 
values  were  so  small,  more  replications  would  not  necessarily  produce  a 
significant  difference  in  the  computed  mean.  Since  standard  deviations 
on  these  variables  are  often  large,  relative  to  the  mean  values,  it  would 
seem  that  wide  variations  in  mean  values  could  still  occur. 
Unfortunately,  the  reports  do  not  contain  enough  information  to  judge  the 
volatility  of  potential  variation  in  values  or  whether  the  values  were 
beginning  to  converge  at  all  on  the  acceptance  criterion.  Since  the  divi¬ 
sional  tactical  environment  for  the  divad  includes  the  coverage  of  36 
guns,  whereas  the  battalion-level  Carmonette  covers  only  4  guns,  there 
is  some  concern  about  the  possible  effects  of  using  unstable  results  from 
a  battalion-level  model  for  projecting  the  operational  effectiveness  of 
the  divad  in  its  tactical  division  environment. 
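
To give a sense of scale for measures whose standard deviation is as large as the mean, the sketch below applies a standard normal-approximation sample-size formula, n ≈ (z · cv / precision)², where cv is the ratio of standard deviation to mean. The confidence and precision defaults are the 85-percent and 10-percent figures cited above; the coefficients of variation are hypothetical, and the formula is an approximation rather than anything taken from the Carmonette reports.

```python
import math
from statistics import NormalDist


def replications_needed(coeff_of_variation, confidence=0.85, precision=0.10):
    """Approximate replications for a half-width of `precision` times the mean."""
    # Two-sided normal quantile (assumption: replication means are roughly normal).
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    return math.ceil((z * coeff_of_variation / precision) ** 2)


# Hypothetical ratios of standard deviation to mean.
for cv in (0.2, 0.5, 1.0):
    print(cv, replications_needed(cv))
```

Under these assumptions, a measure with a standard deviation equal to its mean would call for roughly 200 replications, far more than the 10 to 17 reported, which is consistent with the concern that such measures had not stabilized.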

The  Stinger  battery-coolant-unit  usage  study  addressed  the  requirement 
for  coolant  units  under  varying  conditions  of  visibility  and  types  of 
threat  and  supporting  air  defense  systems.  All  the  computer  runs 
generating  the  data  were  part  of  a  total  como  simulation.  An  implicit 
assumption,  however,  that  one  run  for  each  set  of  conditions  was  sufficient 
raises  the  issue  of  the  number  of  replications  required  for  large-scale 
simulations. 

The  battery-coolant-unit  study  included  a  total  of  11  computer  runs,  one 
for  each  scenario.  If  multiple  replications  of  at  least  one  of  the  scenarios 
had  been  made,  the  analysts  might  have  been  better  able  to  assess 
whether  the  values  produced  by  one  run  were  near  the  true  mean.  There 
was  no  indication  in  the  report  as  to  the  variability  of  results — no  calcu¬ 
lation  of  model  mean  or  variance.  No  reason  was  given  for  this  omission. 
The  analysts  may  have  believed  that  a  single  run  for  each  scenario  was 
acceptable  because  of  the  large  number  of  Stinger  units  operating  within 
the  simulation,  but  no  arguments  were  advanced  to  suggest  or  support 
this  rationale.  The  variability  of  output  results  may  have  been  tested 
when  Army  personnel  performed  validation  and  verification  testing  on 
receiving  the  model  from  the  contractor,  but  this  was  not  documented  in 
the  report. 

In  the  adage  and  Carmonette  cases,  the  evidence  indicates  that  the  ana¬ 
lysts  recognized  the  need  to  estimate  some  of  the  true  model  values.  The 
credibility  of  the  adage  simulation  benefited  from  the  multiple  replica¬ 
tions  used  to  develop  statistically  representative  values.  In  the 
Carmonette  analysis,  it  is  not  clear  that  true  model  values  were  deter¬ 
mined  for  enemy  aircraft  killed  by  the  divad  and  divad  guns  killed  by  the 
enemy,  although  the  attempt  was  made  to  determine  them.  In  the 
Carmonette  analysis  and  implicitly  in  the  battery-coolant-unit  study, 
there  is  an  indication  that  the  analysts  tended  to  combine  testing  for 
underlying  true  model  values  with  testing  for  changes  in  results  stem¬ 
ming  from  parameter  and  scenario  changes.  This  practice  leads  to  a  con¬ 
fusion  of  two  important  but  distinct  areas  and,  thus,  to  a  decline  in 
credibility.  The  Carmonette  analysis  did  use  multiple  replications  in  its 
scenarios  that  enhanced  its  credibility  in  the  development  of  statistically 
representative  values,  but  there  are  still  some  concerns  about  the  stabil¬ 
ity  of  some  of  its  results.  There  was  no  evidence  that  the  statistical  rep¬ 
resentativeness  of  the  como  simulation  was  determined  for  either  the 
model  or  the  scenarios. 


Sensitivity  Testing 


Sensitivity  testing  identifies  how  changes  in  a  model's  parameters  affect 
the  results  in  both  direction  and  magnitude  and  provides  feedback  of  the 
model’s  behavior  to  the  analyst.  When  changes  extend  beyond  the  alter¬ 
ation  of  parameters,  the  process  is  recognized  as  the  testing  of  alterna¬ 
tive  scenarios.  Parameter  testing  is  most  likely  to  be  explored  early  in  a 
simulation’s  development,  but  scenario  testing  is  generally  performed  in 
response  to  particular  questions  about  the  system’s  effectiveness  under 
a  range  of  threats  and  conditions.  (We  have  summarized  the  evidence  of 
testing  for  sensitivity  to  parameters  and  alternative  scenarios  in  table 
5.4.) 
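
A minimal sketch of parameter sensitivity testing follows: each parameter is raised by a fixed fraction while the others are held at baseline, and the direction and relative size of the change in output are recorded. The closed-form toy model, its parameter names, units, and baseline values are all hypothetical and do not represent any of the simulations discussed in this report.

```python
def toy_effectiveness(availability, reaction_time_s, aim_error_mrad):
    """Hypothetical closed-form stand-in for a simulation; not any real model."""
    return availability / (1.0 + 0.1 * reaction_time_s) / (1.0 + 0.05 * aim_error_mrad)


def one_at_a_time(model, baseline, step=0.10):
    """Raise each parameter by `step` (fractional) and return relative output changes."""
    base_output = model(**baseline)
    effects = {}
    for name, value in baseline.items():
        # Copy the baseline and perturb only the current parameter.
        varied = dict(baseline, **{name: value * (1.0 + step)})
        effects[name] = (model(**varied) - base_output) / base_output
    return effects


# Hypothetical baseline values chosen only for illustration.
baseline = {"availability": 0.8, "reaction_time_s": 8.0, "aim_error_mrad": 4.0}
for name, change in one_at_a_time(toy_effectiveness, baseline).items():
    direction = "increases" if change > 0 else "decreases"
    print(f"+10% {name}: output {direction} by {abs(change):.1%}")
```

For a Monte Carlo model, each perturbed case would itself need enough replications that the measured change is not lost in run-to-run variability.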

The  adage  modelers  dealt  explicitly  with  sensitivity  testing  and  testing 
for  uncertainty,  covering  both  in  their  published  reports.  In  the  1985 
comparative  analysis,  they  conducted  sensitivity  analyses  (or  “paramet¬ 
ric  analysis”)  on  four  parameters:  operational  availability,  reaction 
time,  aim  bias  (that  is,  the  offset  of  the  center  of  the  aim  distribution 
from  the  target),  and  angular  aim  error  (that  is,  the  dispersion  of  aiming 
points  around  that  center).1  The  report  indicates  that  operational  availa¬ 
bility,  reaction  time,  and  angular  aim  error  were  critical  parameters  in 
determining  the  divad’s  effectiveness.  Moreover,  the  direction  of  changes 
in  results  was  logically  consistent  with  the  direction  of  changes  in 
parameter  values. 

The  adage  modelers  included  a  chapter  on  uncertainties  in  the  original 
cost  and  operational-effectiveness  analysis  report.  Their  concerns  about 
uncertainty  in  the  modeling  were 

•  operational  employment  concepts  visualized  for  each  weapon  system, 

•  environments  in  which  systems  may  be  placed  on  battlefields  of  the 
future, 

•  threat  levels  and  tactics  to  be  encountered, 

•  system  performance  characteristics  that  directly  affect  effectiveness 
inputs. 

The  first  element  of  uncertainty  dealt  with  the  use  of  weapons  such  as 
rifles  and  tanks  in  an  air  defense  role,  and  analyses  showed  enough  dif¬ 
ference  to  conclude  that  ground  weapons  should  be  integrated  into  the 
air  battle  and  air  defense  weapons  into  the  ground  battle.  The  environ¬ 
mental  aspects  dealt  principally  with  the  effect  of  uncertainties  in  visi¬ 
bility,  and  the  results  show  an  extreme  sensitivity  to  visibility.  The 
threat  uncertainties  dealt  principally  with  expected  enemy  tactics  and 
indicated  significant  increases  in  damage  from  heavily  concentrated 
first-day  enemy  assaults.  The  effectiveness  uncertainties  were 


1This  dispersion  is  distinguished  from  ballistic  dispersion.  Angular  aim  error  deals  with  the 
variability  of  a  gunner’s  aiming  ability;  ballistic  dispersion  is  a  function  of  the  gun  barrel  and  the 
projectile  and  is  sometimes  referred  to  as  "round-to-round  error." 
addressed  in  two  ways,  first  by  holding  the  divad’s  effectiveness 
constant  and  degrading  the  relative  effectiveness  of  other  air  defense 
systems  and,  second,  by  allowing  all  air  defense  systems,  including  the 
divad,  to  be  equally  degraded  in  effectiveness.  The  changes  in 
effectiveness  showed  that  the  divad  held  up  well. 

While  there  is  evidence  that  the  Carmonette  modelers  have  conducted 
sensitivity  testing  on  input  parameters  in  the  past,  the  published  reports 
based  on  the  Carmonette  analyses  of  the  divad  do  not  cover  such  testing. 
However,  some  of  the  scenarios  varied  so  little  that  they  were  essen¬ 
tially  the  same  as  sensitivity  testing.  The  Carmonette  analysts 
addressed  26  different  scenarios  in  the  1984  update  and  25  in  the  1985 
comparative  analysis.  Many  of  these  varied  conditions  too  much  to  be 
called  sensitivity  tests  (for  example,  the  presence  or  absence  of  air 
defense,  the  presence  or  absence  of  certain  types  of  weapon  systems); 
others  changed  conditions  only  slightly  and,  therefore,  are  similar  to 
sensitivity  tests. 

Although  analyses  in  both  Carmonette  studies  showed  that  changes  in 
battlefield  visibility  had  significant  effects  on  the  divad  versus  enemy 
aircraft  effectiveness,  these  effects  were  small  and  had  only  small 
effects  on  overall  battlefield  outcomes.  The  update  analyzed  the  differ¬ 
ence  between  7-kilometer  and  3-kilometer  visibility  ranges;  the  compara¬ 
tive  analysis  reported  on  the  difference  between  7-kilometer  and  16- 
kilometer  visibility  ranges.  Both  studies  also  reported  on  the  effects  of 
changing  the  mode  of  operations  for  the  divad  gun  to  show  the  effects  of 
not  using  some  of  the  divad’s  radar  capabilities  to  track  helicopters.  The 
update  reported  only  on  tests  for  7-kilometer  visibility  days  and  showed 
that  while  the  performance  of  the  divad  itself  is  extremely  sensitive  to 
the  mode  of  operation,  overall  combined  performance  changed  only 
slightly.  The  1985  comparative  analysis  reported  the  same  pattern  of 
results  for  7-kilometer  visibility  days  but  showed  little  variability  for 
16-kilometer  visibility  days.  Additional  tests  conducted  in  the  update 
included  the  effects  of  modeling  capabilities  demonstrated  in  test  firings 
versus  modeling  projected  or  mature  divad  capabilities.  These  results 
indicate  significant  sensitivity  in  performance  but  not  much  change  in 
overall  results.  Finally,  the  Carmonette  reports  showed  only  slight  sensi¬ 
tivity  to  different  levels  of  attacking  enemy  forces  at  the  beginning  of 
the  battle. 
The  importance  of  a  scenario  that  encompasses  more  than  the  single 
type  of  weapon  system  was  clearly  demonstrated  in  the  Carmonette  sce¬ 
narios  that  varied  the  divad’s  capabilities  between  those  that  were  cur¬ 
rent  and  those  of  the  mature  weapon.  The  results,  if  accurate,  indicated 
that  even  though  the  gun’s  performance  improved,  there  was  little 
change  in  overall  air  defense  performance.  These  results  could  not  have 
been  developed  except  by  using  a  simulation  in  which  the  divad  was  but 
one  element  of  the  defense  system,  demonstrating  the  need  for  the 
appropriate  context  in  which  effectiveness  questions  can  be  posed. 

The  como  Stinger  battery-coolant-unit  simulation  included  sensitivity 
analyses  of  visibility.  They  were  accomplished  by  changing  only  the 
Stinger  team’s  visibility  for  several  of  the  air  defense-threat  combina¬ 
tions.  The  quality  of  this  effort  would  have  been  greatly  improved,  how¬ 
ever,  by  multiple  runs.  Sensitivity  analyses  for  some  other  weapon- 
system  models  used  in  the  como  have  also  been  performed.  We  found  a 
documented  example  of  sensitivity  analysis  performed  on  the  como 
HAWK  surface-to-air  missile  model. 

The  como  battery-coolant-unit  study,  however,  is  an  excellent  example 
of  developing  scenarios  that  could  provide  a  comprehensive  view  of  the 
simulation’s  response  under  a  broad  range  of  alternatives.  The  major 
weakness  of  the  11  scenarios  is  that  only  one  replication  of  each  one  was 
made.  The  insensitivity  of  results  among  the  scenarios,  relative  to  the 
battery  requirement,  suggests  that  additional  replications  were  probably 
not  needed.  Nevertheless,  it  is  poor  procedure  to  ignore  the  need  for 
some  measure  of  variance,  especially  since  there  is  no  evidence  that  ear¬ 
lier  analyses  developed  any  measure  of  variability  of  the  model’s 
results.  Even  when  the  variability  of  results  has  been  estimated,  the 
need  for  multiple  replications  of  a  scenario  must  be  carefully  considered. 

The  questions  raised  with  regard  to  a  model  are  formulated  so  that 
answers  can  be  developed  by  experimenting  with  parameters  and 
scenarios.  Scenario  testing  is  essentially  what  we  equate  with  results. 
Valuable  information  that  contributed  to  credibility  was  developed  in  the 
adage,  the  Carmonette,  and  the  como  by  varying  parameters  and  testing 
alternative  scenarios. 


Validation 


Validation  is  the  process  of  determining  that  a  model  is  an  accurate  rep¬ 
resentation  of,  or  agrees  with,  the  real-world  system  being  modeled.  Val¬ 
idation  includes  comparing  simulation  results  to  results  from  the  actual 
system  or  from  other  models,  historical  data,  and  operational  testing.  In 
the  context  of  our  framework,  we  are  interpreting  validation  narrowly 
as  the  process  of  developing  confidence  in  the  simulation  results  by  com¬ 
paring  them  with  results  from  other  sources.  (We  have  summarized  our 
case  study  results  for  validation  in  table  5.5.) 

In  reviewing  simulations  related  to  the  divad,  we  found  several  examples 
of  validation  efforts  for  engineering  simulations  that  were  planned  and 
conducted  as  part  of  the  simulation  development.  The  role  of  the  Army 
Materiel  Systems  Analysis  Activity  in  validating  an  engineering  model 
of  the  divad  developed  by  Ford  Aerospace  demonstrated  the  interrelat¬ 
edness  of  verification  and  sensitivity  analysis  with  validation.  The 
effort  identified  validation  as  a  purposeful  function  within  the  decision¬ 
making  process;  and  it  described  problems  that  can  be  expected  when  a 
system’s  data  are  collected  for  validation  purposes.  In  addition,  its  docu¬ 
mentation  made  extensive  use  of  graphs  that  contributed  to  the  analysis 
of  the  results. 

In  contrast  to  the  validation  efforts  for  engineering  models,  we  found 
that  validations  of  operational-effectiveness  simulations  are  not  planned 
for  or  conducted  routinely  but,  rather,  are  undertaken  when  individuals 
or  an  organization  questions  a  disparity  in  results  between  similar  mod¬ 
els  or  between  the  model  and  real  data  or  even  between  the  model,  per¬ 
ceptions,  and  impressions.  Validation  of  the  operational-effectiveness 
simulations  in  our  case  studies  was  undertaken  to  address  the  questions 
or  issues  that  arose.  For  example,  a  comparison  of  the  modeling  of  the 
divad  by  the  adage  and  Carmonette  was  requested  because  of  the 
substantial  variance  in  results  reported  for  the  two  models. 

In  another  situation,  an  undersecretary  of  Defense  wanted  the  Army 
and  Air  Force  to  jointly  review  their  models,  the  COMO  and  SORTIE,  to 
understand  the  substantially  lower  attrition  of  U.S.  aircraft  against 
Warsaw  Pact  air  defense  compared  to  Warsaw  Pact  aircraft  against  the 
air  defense  of  the  North  Atlantic  Treaty  Organization.  In  another  exam¬ 
ple,  the  como  configuration  management  board,  questioning  whether  the 
simpler  como  simulations  yielded  information  similar  to  that  of  the  more 
complex  ones,  requested  a  study  that  would  corroborate  the  output  of 
simpler  and  more  complex  Patriot  simulations.  The  results  of  this  study 
lent  credibility  to  the  como  integrated  air  defense  model,  which  uses  sim¬ 
pler  weapon-system  models.  In  the  following  discussion,  we  explain  why 
some  of  these  simulations  were  important  to  our  case  studies. 

When  we  made  our  review,  no  formal  validation  efforts  had  been 
performed  on  the  adage.  Because  the  adage  modeled  combat  at  the  division 
level,  test  data  were  generally  not  available  for  performing  validation 
efforts,  since  tests  are  not  conducted  at  that  level.  Perhaps  because  the 
Carmonette  has  been  in  existence  so  long,  it  has  come  to  be  viewed  as  a 
"standard"  against  which  to  validate  other  models  rather  than  as  a 
model  requiring  validation  itself.  We  did,  however,  identify  one 
Carmonette  validation  effort  reported  in  1975,  when  the  Army  Concepts 
Analysis  Agency  compared  the  results  of  the  Carmonette  with  a  tank 
warfare  field  experiment.  The  use  of  both  the  Carmonette  and  adage  to 
model  the  divad  was  really  an  attempt  at  validation  because  original 
adage  results  had  not  been  well  received  in  some  circles.  Part  of  the 
justification  originally  given  for  using  the  Carmonette  to  analyze  the  divad 
was  to  provide  insight  into  certain  key  parameter  values  used  in  the 
adage. 


Because  early  efforts  showed  Carmonette  results  that  diverged  from 
adage  results,  a  combat  development  study  plan  was  adopted  in  January 
1984.  It  established  a  study  advisory  group  and  described  the  tasks  and 
responsibilities  necessary  for  correcting  model,  scenario,  and  data 
problems  discovered  in  the  adage  and  Carmonette  models.  These  tasks 
included  correcting  the  Carmonette's  model  of  the  divad,  modeling  the 
divad  directly  instead  of  the  ZSU-23,  and  including  the  divad's  primary 
mode  of  operations.  The  effort  was  to  go  beyond  an  examination  of 
inputs  and  outputs  and  provide  a  description  and  evaluation  of  each 
simulation,  covering  structure,  scenario,  inputs,  data  usage,  and  outputs. 
Its  purpose  was  to  give  insight  into  how  a  simulation  affects  the 
perceptions  of  a  system's  performance  and  combat  effectiveness.  The 
uncertainty  regarding  the  modeling  of  the  divad  was  great  enough  to  cause 
concern  that  results  for  other  systems  such  as  the  AAH  helicopter  and 
M1  tank  might  be  affected  if  decisionmakers  lost  faith  in  the  adage  and 
Carmonette. 

The  considerable  concern  about  the  credibility  of  the  disproportionately 
heavy  losses  in  the  adage  attributable  to  enemy  aircraft  was  reflected 
not  only  in  the  minutes  of  the  study  advisory  group  but  also  in  our 
discussions  with  dod  personnel.  Evaluating  the  legitimacy  of  this  concern  is 
difficult.  One  comparison  of  the  results  from  the  adage,  Carmonette,  and 
other  models,  for  example,  showed  that  much  greater  damage  was 
attributable  to  enemy  aircraft  in  the  adage  than  in  any  of  the  other 
models.  However,  this  comparison  included  battles  of  different  lengths  (from 
30  minutes  to  7  days)  and  different  coverage  (from  battalion  to  theater), 
so  that  direct  comparisons  are  problematic.  Moreover,  the  Carmonette 
did  not  include  ground  damage  by  fixed-wing  aircraft. 


Page  107 


GAO/PEMD-88-3  Assessing  DOD  Simulations  for  Credibility 


Appendix  III 

Supporting  Material  for  Chapter  5 


This  comparison  did,  however,  serve  as  the  basis  for  further  analyses 
for  the  1984  update  in  which  adjustments  were  made  for  consistency  of 
inputs  and  the  scenarios  were  made  more  comparable  with  respect  to 
size  and  duration  of  battle.  Results  from  a  segment  of  the  adage  battle¬ 
field  were  compared  to  the  Carmonette  results.  Results  from  the  other 
model  were  also  normalized  to  establish  a  comparison  base.  Results  from 
the  adjusted  scenarios  showed  that  the  damage  attributable  to  enemy 
aircraft  in  the  adage  was  basically  comparable  in  the  other  models  and, 
in  fact,  somewhat  conservative.  In  similar  analyses  for  the  1985  com¬ 
parative  analysis,  the  adage  and  Carmonette  comparative  results 
diverged.  In  the  meantime,  however,  several  changes  had  been  made  to 
the  Carmonette,  principally  the  addition  of  fixed-wing  aircraft  and 
changes  in  enemy  infrared  countermeasures.  The  comparative  analysis 
report  cited  different  modeling  of  suppressing  or  aborting  helicopter 
missions  and  differing  levels  of  battle,  but  the  precise  source  of  the  new 
divergence  could  not  be  pinpointed,  since  several  changes  had  been 
made  at  one  time.  Consequently,  there  are  still  unanswered  questions 
about  which  of  the  two  models,  if  either,  produces  the  more  believable 
results. 

In  reviewing  the  adage  and  Carmonette  by  comparing  results,  we  also 
made  the  following  observations: 

•  The  adage  per-raid  attrition  results  are  close  to  the  Carmonette  raid 
results. 

•  Carmonette  analysts  admit  substantial  problems  in  measuring  the  effec¬ 
tiveness  of  various  air  defense  systems  because  of  the  size  of  the  model, 
the  number  of  air  defense  units,  and  the  design  and  operation  of  the 
enemy  fixed-wing  aircraft. 

•  While  the  Carmonette  used  live-fire  test  results  from  Fort  Hunter  Liggett 
to  help  in  modeling  the  divad,  there  is  some  concern  as  to  whether  these 
tests  were  fair  to  the  divad,  because  test  conditions  at  Hunter  Liggett  did 
not  match  European  battle  conditions  very  well. 

•  Attempts  to  crossvalidate  the  Carmonette  against  another  model,  using 
both  to  design  thermal  pinpoint-firing  operational  tests,  were  unsuccess¬ 
ful.  The  results  from  the  two  models  were  inconsistent,  and  both  sets  of 
results  were  inconsistent  with  respect  to  the  operational  test  results. 

•  The  cost  and  operational-effectiveness  update  based  on  the  Carmonette 
tended  to  support  the  adage  conclusion  that  the  divad  was  the  preferred 
weapon. 

In  our  opinion,  there  are  still  enough  unresolved  issues,  both  here  and  as 
discussed  in  chapter  4,  to  raise  questions  about  whether  either  the 
adage  or  Carmonette  has  been  “validated”  as  a  model  for  studying  air 
defense.  The  use  of  both  the  adage  and  the  Carmonette  to  study  the 
divad  did  lead  to  improvements  in  both  models,  but  more  work  still 
needs  to  be  done. 

The  como  is  an  extraordinarily  dynamic  simulation  system.  With  the 
addition  of  new  weapon  models,  differing  levels  of  detail,  multiple  sce¬ 
narios,  and  variations  for  different  computers,  addressing  the  issue  of 
validation  is  a  complex  question.  Validation  efforts  should  be  directed  at 
the  several  levels  of  simulation  at  which  the  como  model  is  run:  the 
generic  large-scale  air  defense  scenario;  the  detailed  simulation  of  a  par¬ 
ticular  weapon  system;  and  simplified,  faster-running  versions  of 
weapon  systems  that  can  be  substituted  in  some  applications  for  more 
detailed  ones. 

We  did  not  find  documented  evidence  of  a  validation  effort  specifically 
for  the  Stinger  weapon  simulation,  but  we  did  find  evidence  of  valida¬ 
tion  of  the  overall  como  model  that  lends  credibility  to  any  effort  in 
which  the  como  is  used.  We  also  found  a  validation  effort  for  the  Patriot 
that  was  sufficiently  successful  to  lead  to  similar  weapon-system 
validations. 

The  validation  effort  that  addressed  the  overall  como  model  came  about 
as  a  result  of  Army  and  Air  Force  interest  in  jointly  understanding  the 
reasons  for  major  differences  in  attrition  estimates.  This  analysis  was  in 
a  sense  a  validation  of  two  models  using  the  overall  similarity  of  results 
as  the  measurement  device.  The  great  disparity  in  results  between  the 
como  and  the  SORTIE  suggested  initially  that  either  one  or  both  models 
had  serious  failings.  In  May  and  June  1980,  Air  Force  and  Army 
evaluators  met  to  review  the  scenarios,  input  data,  structure,  and 
assumptions  of  the  two  models,  and  each  group  adjusted  its  model  to  reflect 
agreed-upon  standard  conditions  and  assumptions.  The  results,  which 
were  overall  measures  of  attrition,  indicated  that  the  models  are  in  good 
agreement  when  simulating  similar  conditions.  The  original  differences 
were  primarily  attributed  to  different  estimates  of  system  effectiveness 
and  differences  in  aircraft  attack  philosophies,  goals,  and  doctrine.  This 
resolution  lent  credence  to  each  model  and  suggested  factors  likely  to  be 
modified  in  military  planning.  The  emphasis  was  on  the  selection  of 
input  data  that  accurately  represented  operational  conditions.  Test 
procedures  for  sensitivity  analysis  were  present  in  the  form  of  limited 
variation  of  individual  factors  and  determining  their  effect  on  results.  This 
practical  effort  addressed  the  credibility  issues  of  interest  to  staff  at 
high  levels  of  the  Department  of  Defense. 
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The  validation  for  the  Patriot  simulations  used  with  the  COMO  was  a  rig¬ 
orous  evaluation  effort.  The  quantity  of  input  data  was  kept  at  an 
experimental  level  rather  than  extensive.  Most  of  the  processing  was 
limited  to  single  or  several  aircraft  against  a  single  Patriot  battery. 

Thus,  the  results  could  be  closely  analyzed  and  an  explanation  for  dif¬ 
ferences  could  be  determined.  As  substantial  as  such  validation  efforts 
must  be,  they  provide  a  degree  of  credibility  that  is  probably  not 
matched  by  any  other  weapon-system  simulations  with  the  oomo. 

Attempts to validate the ADAGE and Carmonette simulation results with
each other and with those from other models have not been completely
successful. Differences between simulations make comparison quite
difficult and, while some results have been basically comparable, under
other conditions they have diverged. In one major comparison between
the ADAGE and Carmonette, there was a reasonable correspondence
between results that did not carry over to a second major comparison.
There was no evidence of the use of historical data. Operational data
were used as input to the Carmonette but not comparatively.

The comparison between overall simulation results demonstrated in the
COMO-SORTIE analysis did not approach the rigorous standards of the
Patriot analysis, but such analyses become less feasible as the compared
simulations encompass a larger and more complex environment and
represent more divergent modeling approaches. There was no documented
evidence of validation for the Stinger battery-coolant-unit model.


Independent Validation Efforts


In the validation efforts described in chapter 5, work was performed by
the organization developing the model or one organizationally related.
The Army Material Systems Analysis Activity provided an independent
review of the Ford DIVAD gun model but had previously participated in
writing its design specifications. In the air battle attrition
validation effort, the Air Force and Army operated their own
simulations (the SORTIE and the COMO). The ADAGE and Carmonette were
modified or corrected by their respective organizations but under the
review and direction of the study advisory group. These instances
approximated independent validations that may enhance a model's
credibility. We identified two others, not part of our case study
analyses, that are noteworthy because of the efforts made to ensure
impartial evaluation.

In 1983, the Center for Naval Analyses prepared a report for the Navy's
Harpoon (an antiship missile) project office to assist in selecting one
or more models that represented the state of the art in evaluating the
Harpoon’s performance. The center was asked to provide a detailed
comparison of six widely used models that had been developed by various
naval laboratories and contractors. The models were developed for
different applications, and as the scenarios were made more complex,
some models were not applicable. Some of the models ran on mainframe
computers and others on minicomputers. Comparisons appeared to be for
overall similarity of results, with some comparison of specific events
between models. The data that were produced did not allow statistical
analyses of important model details such as flight paths. When
disagreements were found, the center attempted to find the cause and
gave the developers of the models the opportunity to make corrections
and rerun the scenario. When differences appeared to be the result of
the modeling approach or basic assumptions, the apparent causes and
results were documented. One especially interesting outcome was that
several of the models gave results for a many-on-many scenario that
were nonintuitive and would not have been suggested by the one-on-one
scenarios.

In the second case, Sandia National Laboratories had developed a model,
the SANDEMS, for analyses related to a surface-to-air missile. A
committee of users, representing various naval commands, laboratories,
and contractors, recommended that the model undergo validation so that
its results would have more credibility with the Navy and be formally
accepted for the Navy’s AEGIS project.

While there is no formal process of model review at the Sandia
laboratory, the developers had subjected the SANDEMS to an informal
validation and verification. The independent reviewers agreed, however,
that it was not a full-scale validation of all the aspects of the model.
The only documentation available was preliminary, partial, and in some
cases made up of obsolete subroutine descriptions and the program
listing.

The Applied Physics Laboratory, a member of the committee, was asked
to undertake the validation effort. RCA Corporation, also a member,
provided assistance. The laboratory selected personnel who had
extensive experience in the development and use of naval surface-to-air
missile engagement models. They developed a detailed checklist as a
framework: (1) steps in validation (purpose of model, completeness,
realism, correctness of data and computations, flexibility for
expansion), (2) structural description of the model, (3) scenario
capability, (4) model output, (5) nuclear effects, and (6) modeled
processes. Test runs for some scenarios agreed with accepted results
from other models. For other scenarios, the analysts relied on examining
the logic of the model because there were no other models with broad
Navy acceptance to which they could compare the results.

The  review  identified  the  limitations  of  the  model  in  terms  of  what  it 
represented  and  what  it  failed  to  represent  or  represented  only  par¬ 
tially.  The  review  also  described  the  conditions  that  could  and  could  not 
be  validly  addressed  because  of  the  limitations.  Early  in  the  assessment 
effort,  the  analysts  noted  that  “The  items  judged  to  be  critical  for 
SANDEMS  validity  will  depend  on  the  intended  purposes  of  the  model,” 
thus  recognizing  that  validity  depends  on  context. 


Summary 


We  recognize  that  verification  is  substantially  integrated  with  the  pro¬ 
gramming  process  and  that  documentation  of  the  process  has  been 
sparse,  even  though  the  documentation  of  the  programmer’s  product 
may  be  quite  complete.  We  think  that  simulation  users  are  entitled  to 
some  knowledge  of  the  verification  efforts  in  a  simulation’s  develop¬ 
ment  and  that  such  information  strengthens  the  credibility  of  simula¬ 
tions.  It  can  be  incorporated  into  existing  documentation.  We  have  seen 
that  when  questions  about  credibility  are  raised,  other  analysts  brought 
in  to  assess  a  simulation  do  perform  their  own  verification  efforts.  We 
take  this  as  evidence  of  the  importance  of  verification  and  the  need  for 
some  recording  of  it. 

In our case studies, the longer-running, more complex simulations were
evaluated with fewer simulation runs. If this represents a tendency to
treat the results of one or a few runs of a complex model as “true”
estimates, we see the potential for substantial questions about
credibility. If the true values underlying a simulation are not known,
then one really never knows the degree of comparability between the
model and the real world or between the model’s results and the results
from other models.

We note that the number of simulation runs is not the sole criterion to
be used in developing these estimates. Various statistical measures have
been and are being developed to increase the efficiency of the
estimating process, but the basic issue, confidence that the simulation
results are an accurate reflection of the model’s underlying values, is
important and requires recognition. Analysts working with very large,
long-running simulations such as the COMO should try to develop or
identify methods in which confidence levels based on results from
individual model components can be incorporated into the total
estimation process, so that less demand will be made on computer
resources for a given level of confidence.
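
The concern about treating one or a few runs as “true” estimates can be
illustrated with a minimal sketch. It assumes a hypothetical stochastic
simulation, run_once, that returns one attrition fraction per run; the
function names, the underlying value of 0.30, and the noise level are
illustrative assumptions and are not taken from the COMO or any other
model discussed in this report. Under those assumptions, the sketch
shows how the half-width of an approximate 95-percent confidence
interval narrows as the number of replications grows.

```python
import random
import statistics


def run_once(rng):
    """Hypothetical stand-in for one stochastic simulation run.

    Returns an attrition fraction scattered around an assumed
    underlying value of 0.30 (illustrative only), clipped to [0, 1].
    """
    return min(1.0, max(0.0, rng.gauss(0.30, 0.10)))


def replicate(n_runs, seed=1):
    """Run the simulation n_runs times; return (mean, 95% CI half-width)."""
    rng = random.Random(seed)
    results = [run_once(rng) for _ in range(n_runs)]
    mean = statistics.mean(results)
    # Normal approximation to the standard error of the mean; for very
    # small n_runs a Student-t multiplier would give a wider interval.
    std_err = statistics.stdev(results) / n_runs ** 0.5
    return mean, 1.96 * std_err


for n in (5, 50, 500):
    mean, half_width = replicate(n)
    print(f"{n:4d} runs: estimate {mean:.3f} +/- {half_width:.3f}")
```

An estimate reported together with its interval lets a reader judge
whether a difference between two models, or between a model and field
data, is larger than the run-to-run variation.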

The sensitivity testing of parameters and the testing of scenarios were
treated effectively in the ADAGE, Carmonette, and COMO. In fact, the
apparent need is to integrate parameter and scenario work with the
work of determining the true estimates for a simulation. When the true
underlying results are not determined, then the simulation results are
open to question. Analysts have a firmer foundation upon which to
discuss the results of variations in parameters and changes in scenarios
when the underlying information requirements have been developed.
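
A minimal sketch of why parameter variation needs to be tied to the
estimation of underlying values follows. It assumes a hypothetical toy
model, simulate, in which a single probability-of-kill parameter drives
the fraction of engagements that end in a kill; the names, the parameter
values, and the 20 engagements per run are illustrative assumptions, not
features of the ADAGE, Carmonette, or COMO. The sketch varies one factor
and reports the change in the mean result alongside its standard error,
so that the effect can be compared with run-to-run noise.

```python
import random
import statistics


def simulate(p_kill, n_engagements, rng):
    """Hypothetical toy model: fraction of engagements ending in a kill."""
    kills = sum(rng.random() < p_kill for _ in range(n_engagements))
    return kills / n_engagements


def mean_and_stderr(p_kill, n_runs, seed):
    """Replicate the toy model; return (mean result, standard error)."""
    rng = random.Random(seed)
    runs = [simulate(p_kill, 20, rng) for _ in range(n_runs)]
    return statistics.mean(runs), statistics.stdev(runs) / n_runs ** 0.5


def sensitivity(base, varied, n_runs, seed=7):
    """Vary one factor; return (difference in means, combined std. error)."""
    m0, se0 = mean_and_stderr(base, n_runs, seed)
    m1, se1 = mean_and_stderr(varied, n_runs, seed + 1)
    return m1 - m0, (se0 ** 2 + se1 ** 2) ** 0.5


for n_runs in (3, 300):
    diff, se = sensitivity(0.40, 0.45, n_runs)
    print(f"{n_runs:3d} runs per setting: change {diff:+.3f}, std. error {se:.3f}")
```

With only a few runs per setting, the standard error can be as large as
the change being studied, which is the situation in which the results of
a sensitivity test are open to question.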

Validation is an appealing and potentially powerful method of raising a
simulation’s credibility. For all its attraction, however, it does not
appear to have been used in the ADAGE, Carmonette, and COMO as a matter
of course. Instead, validation efforts were initiated when questions
were raised about credibility. We found some COMO weapon simulations
in which validation contributed to credibility. Validation based on
model-to-model comparisons, typified by the COMO Patriot simulation
work and the Harpoon comparison, contributes importantly to a model’s
credibility and should be performed routinely, not merely in an ad hoc
effort to respond to questions or criticism.


Support for Design, Data, and Operations


Institutional practices can help ensure that credible simulations are
established and maintained. Two such practices that we found in
reviewing the ADAGE, Carmonette, and COMO were configuration management
and the use of oversight and review groups. (We have summarized
support structures for the design, data, and operations of our case
studies in table 6.2.)


Configuration Management


The 1982 TRADOC regulation entitled “Management: TRADOC Models”
provides guidance on managing models with considerable attention to the
control functions. It states that “only one agency designated by HQ
[Headquarters] TRADOC will be responsible for the configuration
management and development of software and data base maintenance of each
model.” That agency may provide a model to other TRADOC agencies, but
the receiving agency's changes in the model are to be only for internal
use and must be coordinated with the responsible agency. Changes made
for internal use are not to leave the receiving agency and not to
incorporate routines that change the nature of the model. All other
changes are to be made only by the responsible agency.

This regulation further designates TRADOC service schools (like the Army
Air Defense Artillery School) responsible for developing configuration
control and improving, operating, and maintaining models that permit
the evaluation of two-sided military engagement in which a single
function of combat is considered in detail. The schools are to assist
all other users in the development, improvement, and operation of their
specific functional battlefield “modules” included in other combat arms
and agency models at all levels. The ADAGE and COMO, for example, are
models for air defense.

While the regulation indicates that the Army Air Defense Artillery
School would be assigned management responsibilities, ADAGE is
controlled by the Army Material Systems Analysis Activity (the original
developer), and the COMO is assigned to the U.S. Army Missile Command.
Not only does this differ from the current regulation but it also puts
the functional operational-effectiveness models under the control of the
agencies developing weapon systems, which gives the appearance of a
possible conflict of interest. TRADOC Systems Analysis Activity is
designated to develop and operate force-on-force combat development
models at the battalion level, such as the Carmonette and others. It is
responsible for the Carmonette, but other versions of the Carmonette are
used by other organizations such as the Concepts Analysis Agency.

For all TRADOC models, the regulation designates the deputy chief of
staff for doctrine responsible for ensuring that doctrine, future
concepts, and threat are properly portrayed. Other organizations are
responsible for establishing and maintaining the actual data used in the
simulations. Although the modelers obtained their input from these
sources, the information was not always current, complete, or
compatible with a model’s data requirements. For example, no
organization had an updated scenario that could be used for the ADAGE
analyses of the DIVAD. Similarly, the probability-of-kill information
had to be manipulated before it could be used in the Carmonette analyses
of the DIVAD.

Of the three simulations we reviewed, only the COMO had an established
interagency group that focused on configuration control. In 1980, a COMO
models management board was established that developed a baseline
COMO III software ensemble, produced a management plan, established
working configuration control, supervised documentation, and monitored
the development of new weapon models. The board’s plans include
improving command, control, and countermeasures and establishing a
formal hierarchy of COMO models.

In August 1985, a group representing most COMO users convened at
Kirtland Air Force Base to share information and develop strategies
regarding the COMO’s “standardization” and future use, especially given
the newly developed transportable version. They discussed the need for a
model resource group that would assist individual groups by jointly
determining the need for improvements and who should be responsible
for them. This group would also be responsible for preventing the
uncontrolled proliferation of COMO operating systems. They made it clear
that the COMO should be sufficiently uniform that outputs will be
reliably similar, regardless of the computer on which a simulation is
run.

The establishment of the COMO’s model resources group in 1986 and
model management board suggests that organizations do recognize the
need for the management and coordination of major modeling efforts.
With the use of the COMO extending to many Army, Navy, and Air Force
units and to NATO and its allies, the need for such coordination is
obvious. The extent to which multiple applications of the COMO will be
effectively managed or coordinated by these oversight groups is still
not clear, but their recognition of the need for a higher level of
oversight or management, and their attempts to coordinate it, are
important initial steps.

TRADOC establishes study advisory groups to monitor the progress of its
significant studies, and weapon-system program offices appoint system
simulation working groups to oversee engineering simulations. The
Army Air Defense Artillery School analysts conducting the initial DIVAD
cost and operational-effectiveness analysis acknowledged in their 1977
report that the study advisory group provided an open forum for
discussing disagreements and that overall the group was helpful and
provided great assistance. The study advisory groups for the 1984 cost
and operational-effectiveness update and the 1985 comparative analysis
were especially active in directing the reconciliation of disparities in
the ADAGE and Carmonette results. For example, when the Carmonette was
first used to analyze the DIVAD for the update in late 1983, its initial
results and those of the ADAGE led to different appraisals of the DIVAD
on the battlefield. This situation, along with other identified or
apparent errors, omissions, and anomalies in the models plus
inconsistencies in the scenarios, warranted a detailed review of both
models before they were used further in analyzing the DIVAD.

To  give  insight  into  how  a  model  affects  perceptions  of  a  system's  per¬ 
formance  and  combat  effectiveness,  the  deputy  chief  of  staff  for  combat 
developments  established  a  study  advisory  group  to  correct  problems 
with  the  models,  scenarios,  and  data.  The  following  study  objectives 
were  established: 

•  Describe and evaluate the ADAGE and Carmonette models’ structures,
scenarios, inputs, data usage, and outputs.

•  Identify  errors,  omissions,  and  problems  associated  with  the  models, 
including  coverage  of  data,  scenarios,  structure,  and  any  other  question¬ 
able  factors  or  characteristics. 

•  Prioritize corrections by severity of problem, level of difficulty,
time, and resources required for each correction or improvement or
change to the models.

•  Make  changes  where  feasible  within  established  deadlines  and  review 
them  prior  to  production  runs. 

•  Update  the  cost  and  operational-effectiveness  analysis  according  to  the 
run  designs  approved  by  the  study  advisory  group  and  evaluate  the 
results. 

We believe that while study advisory groups provided a quality-control
check for the simulations used in a specific study, their involvement was
limited to short-term issues. The membership of the three study
advisory groups for the DIVAD studies was not always the same, further
limiting their ability to focus on long-term problems, such as defining
validation requirements and working with operational test organizations
to obtain the necessary data.

For  the  engineering  simulations  for  several  of  the  Stinger  weapons,  sys¬ 
tem  simulation  working  groups  were  established  to  define  validation 
requirements  and  to  review  and  approve  validation  data.  They  were 
chartered  by  the  Stinger  program  office  and  included  representatives 
from  Army  laboratories,  test  and  evaluation  groups,  users  of  simula¬ 
tions,  and  the  contractor.  They  usually  met  three  or  four  times  a  year,  as 
necessary.  They  coordinated  closely  with  the  test  integration  working 
group,  which  had  broader  representation,  to  ensure  that  test  data  neces¬ 
sary  for  validation  were  obtained.  Many  of  the  persons  representing  the 
different  organizations  on  the  system  simulation  working  groups  have 
remained  the  same,  giving  some  continuity  to  their  oversight  functions. 


Documentation 


For the three case study simulations, we found varying levels of
documentation. The Carmonette had very little documentation. The ADAGE
and COMO documentation were more complete. (We have summarized our
review of the documentation in table 6.3.)

The TRADOC Systems Analysis Activity provided us with an executive
summary describing key features of the Carmonette and a list of input
elements. These were helpful for a general understanding but
incomplete for answering many of our questions. We also found examples
of the problems created by the lack of documentation in the use of
simulations in the DIVAD update. When that cost and
operational-effectiveness analysis was being conducted, the necessity
and utility of computer documentation, particularly for the Carmonette,
became apparent when analysts at the Army Air Defense Artillery School
attempted to understand the apparent disparity between the outcomes of
the Carmonette and ADAGE. The school's analysts repeatedly expressed
their concern that a reasonable understanding of the internal function
of the Carmonette would be extremely unlikely in the absence of
documentation. Although the TRADOC Systems Analysis Activity analysts
were cooperative in providing detailed answers to specific questions,
deficiencies were uncovered only through a serendipitous discovery.

Concern about this lack of documentation for the Carmonette was also
expressed by various persons at higher Army levels, including the
general appointed chairman of the study advisory group for the 1984
update. At the first meeting, the Army Material Systems Analysis
Activity gave a detailed briefing on the ADAGE, describing formulas, and
TRADOC Systems Analysis Activity presented general information on the
Carmonette, addressing broad capabilities. The chairman stated
specifically in the minutes that the lack of documentation on the
Carmonette was disappointing. Although this dissatisfaction was noted,
we did not find that the TRADOC Systems Analysis Activity was directed
to improve its documentation.

The ADAGE was better documented. However, the update analyses
resulted in changes that would require commensurate changes in the
documentation in order to keep it current, and the most current
documentation provided to us is dated September 1978, prior to the
update. The Army officials whom we interviewed did not mention any major
problems with the adequacy of the ADAGE documentation. We were told
that the two principal users, the Army Material Systems Analysis
Activity and the Army Air Defense Artillery School, communicated
frequently about the ADAGE’s functioning and changes.

TRADOC analysts identified the disadvantages of documenting all existing
models as time consuming or costly if done by contract with civilian
firms. Documenting future models adds to the cost of a model's
development and maintenance. Other analysts noted that a significant and
probably unaffordable level of effort would be required to provide even
minimum documentation for the current Carmonette simulation.

The COMO has been documented with more than 90 reports. Early
documentation was produced in the late 1960’s and early 1970’s at the
technical center of the Supreme Headquarters of the Allied Powers in
Europe. Documentation has been produced since the late 1970’s,
primarily by or for the Army Missile Command in developing and
validating individual weapon system models and improving the COMO
program within which the simulations are run.

We reviewed the main documentation produced for the Stinger and
found it comprehensive and detailed. It was essentially a combination of
the analysts' and programmers' manuals. It assumed a substantial
knowledge of the total COMO system. We did not obtain a briefing
document on the Stinger simulation, which was described to us as a
combination of a gross overview and user-analyst charts. It is not known
to what extent the differences between the COMO III models that are
operated at various facilities have been documented. Our discussion with
the developer of the new transportable version led us to believe that
such documentation is limited and informal. There is no listing of such
material in an index of COMO documents. Validation documents produced
for the Patriot and Hawk simulations were comprehensive reports greatly
useful for understanding the validation efforts and the strengths and
limitations of the models, but no such documents were available for the
Stinger.


Reporting 


We reviewed the major reports containing the Carmonette and ADAGE
results for the DIVAD cost and operational-effectiveness analyses. For
the COMO, we reviewed the Stinger battery-coolant-unit usage study report
and the report on the validation of the Patriot model. We looked for
information that explained a simulation’s purpose, its theoretical
basis, and its capabilities and limitations. (We have summarized our
findings in table 6.4.)


The 1977 DIVAD Report


The purposes of the initial DIVAD cost and operational-effectiveness
analysis reported in 1977 were to determine if there was a mission need
for low-altitude air defense of mobile combat forces deployed near the
forward edge of battle, to evaluate the merits of systems that might
fill this need, and to recommend one. The report addressed 11 study
objectives ranging from gathering basic data describing enemy threat
systems and defending forces through developing alternatives to the
cost-effectiveness justification of any proposed new gun. Essential
elements of the analysis were also clearly stated.

The 1977 report clearly established the theoretical importance of
studying air defense in a division context and considered the support of
combat forces at the forward edge of battle and the defense of critical
assets in the rear. The report clearly stated its underlying
assumptions. The report included reduction in damage to division assets
as a measure of effectiveness, thus addressing an essential function of
air defense. It explicitly stated that the Campaign is an expected-value
submodel and described the model's logic and the relationship between
the Incursion and Campaign submodels. This report also described how the
air-to-air and ground battle results were integrated into the overall
calculation of battle results. The report indicated that ground battle
damage was generated from sources external to the ADAGE but did not
describe those sources in detail. This is important, since the ADAGE's
portrayal of the ground war was controversial. The portrayal of the
ground battle was clearly labeled an external element to the ADAGE, but
implications resulting from errors in the ground battle portrayal, such
as the effect on measures of effectiveness, were not reported.


The 1984 Carmonette Report


The purpose of using the Carmonette in the cost and
operational-effectiveness update reported in 1984 was to determine the
operational effectiveness of the DIVAD gun in a European, combined arms,
main battle area. The background statement for the report indicated that
analysis using the Carmonette was to investigate the probability of all
ground units participating in air defense, but this objective was not
reported. Three objectives were addressed: (1) the determination of the
operational effectiveness of the DIVAD, (2) the determination of the
characteristics of the DIVAD necessary to achieve successful
engagements, and (3) the determination of the potential contribution of
other friendly ground forces to the air defense role. Essential elements
of the analysis were not clearly delineated.

The Carmonette 1984 report did not describe the theoretical basis for
the analysis very well. An understanding of the ground war, a major
portion of the Carmonette, seems to have been taken as a given, since
the ground war was hardly discussed. The main report neglected to
mention the magnitude or implications of the notable differences
(presented in an appendix) between the air defense burst-fire submodel
used in the Carmonette and a similar model used by the Army Material
Systems Analysis Activity. The report mentioned the modification of the
Carmonette to reflect realistic play of the gun, but it described the
DIVAD's modes of operation rather than the Carmonette's modeling of
these features. Although it did discuss how the Carmonette classified
and prioritized targets, there was only minor mention of firing doctrine.

Visual acquisition was discussed and the 3-kilometer limitation for
visual detection and the substitution of forward-looking infrared
sensors were mentioned. However, the report made a recommendation
about equipping the DIVAD with a forward-looking infrared sensor when
the DIVAD's performance without that capability had not been studied
because of limitations in the Carmonette.

The report explicitly listed the following limitations:

•  fixed-wing aircraft are not addressed,

•  high- and medium-altitude air defense systems are not addressed, and

•  DIVAD radar signature disclosing the location of friendly forces is
not addressed.

An implicit limitation is computer time as reflected in the number of
replications for determining the stability of the results. Clearly
unstable results were accepted and not enough information was presented
to evaluate their extent or effect.

The 1984 ADAGE Report

Since the Army Air Defense Artillery School has not yet issued a formal
report delineating the results of the ADAGE modeling for the 1984 update,
we reviewed a draft of the proposed report. The stated purpose of the
update analysis was to

•  determine if the DIVAD gun was still cost effective and operationally
effective and

•  analyze various force structures, including alternatives, in order to
recommend the preferred air defense artillery weapons.

The purpose of the simulation was to quantify changes in losses to the
division from fixed-wing aircraft and helicopter attack caused by
variations in the gun's performance in different operating modes with
different performance capabilities based on observed performance and
forecasted capability. The report updated the enemy air threat
capability and posture. The ADAGE's primary purpose of studying air
defense in the context of a combat division was stressed again, and the
protection of forward combat units and the interdiction of fixed-wing
aircraft and helicopters flying by the main battle area to the
division's rear were also stressed.


The 1984 report described the model's logic less rigorously than the
1977 report. While the later report described the relationship between
the Incursion and Campaign submodels, it did not describe in detail how
air-to-air and ground battle results were integrated into the overall
calculation of battle results. Changes to the ADAGE model made after the
original report were described, including changes to correct problems
that were uncovered and to improve the efficiency of the model's
processing capability. To address criticisms, the report contained a
section reconciling the ADAGE's results with results produced by the
Carmonette and other TRADOC models of air defense, showing the ADAGE
comparing favorably with the other models.

The  1984  update  was  not  as  explicit  as  the  original  report  in  describing 
the  underlying  assumptions  of  the  basic  analysis.  However,  it  included 
analyses  of  alternative  air  defense  artillery  force  structures,  and  their 
assumptions  and  limitations  were  clearly  stated.  A  feature  of  the  update 
not  present  in  the  first  report  was  a  description  of  nonquantifiable  areas 
that were indicated as further supporting the need for the DIVAD.
The 1985 Carmonette Report


In October 1985, TRADOC Systems Analysis Activity published its report
on the DIVAD comparative analysis, the purpose of which had been simply
stated as to compare the DIVAD to selected alternatives by means of
the Carmonette model. There were four objectives:

•  obtain the most current information on the performance data for each
air defense weapon-system alternative considered in the analysis;

•  determine the operational effectiveness of each alternative;

•  compare the operational effectiveness of the DIVAD and its near-term
alternatives at two different levels of daytime visibility; and

•  analyze the effects of a particular alternative as a replacement for
the DIVAD in the air defense force.

The  assumptions  stated  in  the  report  clearly  indicated  that  the 
Carmonette  analysis  was  limited  to  the  study  of  battalion  air  defense. 
The  report  also  indicated  that  command,  control,  and  communications 
and IFF were not to be played in the analysis.

The comparative analysis report gave only a cursory description of the
theoretical foundations of the Carmonette model. Limitations from the
previous Carmonette analysis were still not adequately addressed: high-
and medium-altitude air defense systems were not mentioned, the problem
of the DIVAD radar signature's disclosing friendly forces was not
addressed, and the report's coverage of enemy fixed-wing aircraft was
misleading. The report implied that the Carmonette model included a
submodel for handling attacking fixed-wing aircraft, whereas results for
fixed-wing aircraft engagements with most air defense weapons were
based on data from another computer model. (The Carmonette now has
a fixed-wing submodel that it did not then have.) Nothing was mentioned
about computer time, but the analysis was again based on few
replications, and clearly unstable results were accepted for some
important factors.

The following quotation from the comparative analysis report casts
considerable doubt, in our opinion, on the adequacy of the Carmonette for
studying air defense alternatives:

"The force effectiveness shows little discrimination between the air
defense alternatives because of the small number of air defense units at
the battalion level, the ineffective RED [that is, enemy] fixed wing
capability played in this scenario, because a number of other ground
systems made up some of the difference in kills against RED helicopters
and the relatively ineffective RED helicopters that were attempting to
acquire and engage BLUE [that is, friendly] units in a defensive
posture." (Infantry System Division, Army TRADOC Systems Analysis
Activity, Sgt York Comparative Analysis (White Sands Missile Range,
N.M.: October 1985), p. v. Underscoring added.)


The 1985 ADAGE Report

As with the 1984 update, no official reports of the ADAGE comparative
analysis were issued and our comments here apply only to a November
1985 draft report. Its purpose and objectives were essentially the same
as those for the Carmonette report. The objective that differed was to
use the ADAGE to conduct a parametric analysis of the DIVAD to determine
both its sensitivity to variations in performance parameters and how
much these would have to be degraded to make the DIVAD no longer
preferred among the alternative weapon systems being compared.

The report contained a simple description of the ADAGE model that was
not nearly as complete as the description in the original analysis.
Although the report clearly emphasized the importance of a division
context for studying the effectiveness of an air defense weapon and
assets preserved as a measure of effectiveness, the results tended to
concentrate on the effectiveness of protecting forward combat units
rather than the whole division. This resulted from a direction from the
Army's deputy undersecretary for operations research to simulate test
results from follow-on evaluation tests conducted at Fort Hunter
Liggett, which played only a battalion task force along the forward edge
of battle. Another limitation of the ADAGE mentioned in this report was
its inability to portray mission aborts, a feature that was modeled in
the Carmonette. Finally, the ADAGE report discussed a limitation that
was also true for the Carmonette analyses: the inability to simulate
human factors displayed at the Fort Hunter Liggett follow-on evaluation.

The ADAGE modelers tried to reconcile their division-level results with
the Carmonette's battalion-level results, but the inconsistencies could
not be completely reconciled. The report attributed the differences to
two basic sources:

•  differences in the models, especially the suppression of enemy
helicopters and mission aborts present in the Carmonette but not in the
ADAGE, and

•  the ADAGE's portrayal of a 2-day division battle compared to the
Carmonette's portrayal of a 25-minute battalion battle whose outcome
was very dependent on the scenario.
The ADAGE report concluded by recommending that the DIVAD should be
bought.

The COMO Reports

The purpose of the Stinger battery-coolant-unit usage simulation was to
assist the Stinger project office in determining the appropriate number
of battery coolant units that should be available to the Stinger team in
wartime. The report clearly developed the rationale for the scenarios
that were used, and it identified limitations stemming from both the
computer and the model. The description of the model was detailed
enough to allow an analyst to examine its operation fully. However, the
DIVAD and threat aircraft models were not discussed in equivalent detail.
Thus, the main limitation of the report was the implicit acceptance by
the analyst, and necessarily the user of the results, that the DIVAD
simulation was valid and that the overall results would not be biased by
an inaccurate DIVAD model. In fact, the use of the unit appeared to be
noticeably different when the DIVAD was part of the scenario, suggesting
that the DIVAD had an effect on its usage. Fortunately, these differences
were small in the aggregate and would not appear to affect the decision
options available to the Stinger project office.

Our discussion of the Patriot validation was based on a document
entitled Benchmarking the COMO III Baseline Patriot with Simpler COMO III
Generic Models. The purpose of the document's analysis was clearly
stated. Differences from varying theoretical approaches and practical
implementation were clearly explained and guidance was provided as to
whether differences could be reconciled by modifications and
corrections or were essentially a result of the theory under which the
particular model was developed. The document is an outstanding example
of reporting on comparison and validation.

Summary

The third area of concern in the assessment framework we sketched in
table 2.1 deals with the institutional practices used to establish and
maintain the credibility of a simulation's results. We looked for the
quality-control mechanisms used for the ADAGE and Carmonette in the DIVAD
cost and operational-effectiveness analyses and for the COMO in the
Stinger analyses. Our observations cannot be generalized beyond these
three cases. We found several examples of efforts to improve a
simulation's credibility as well as problems that arose when such
efforts were not made.
Lack of documentation appears to be the greatest problem. The
frustrating experience of trying to use the Carmonette without sufficient
documentation for the DIVAD is one example.

Oversight groups for the Stinger appear to have been explicitly
concerned with validity, defining validation requirements in advance and
[illegible] developmental testers to ensure that [illegible]
available. The study advisory groups for the DIVAD also appeared to be
concerned about credible results but focused more on assumptions,
[illegible], and inputs and on comparing the results with those of other
simulations. They appeared neither to define validation requirements
systematically nor to work with the [illegible].

Most of the reports that we reviewed explicitly described major
omissions in the simulations. The Carmonette report did not, however,
include some of the lesser limitations that may not be severe by
themselves but cumulatively may seriously damage credibility.
Note: GAO comments supplementing those in the report text appear at the
end of this appendix.


See  comment  1 


See  page  15 



OPERATIONAL TEST
AND EVALUATION


OFFICE OF THE SECRETARY OF DEFENSE

WASHINGTON, DC 20301-1700


6  OCTOBER  1987 


Mr.  Frank  C.  Conahan 
Assistant  Comptroller  General 
National  Security  and  International 
Affairs  Division 
U.S.  General  Accounting  Office 
Washington,  D.C.  20548 

Dear  Mr.  Conahan, 

This is the Department of Defense (DoD) response to the General
Accounting Office (GAO) Draft Report "DoD SIMULATIONS: Lack of
Assessment Procedures Threatens Credibility of Results," dated July
2, 1987 (GAO Code 973195/OSD Case 7336).

The DoD has reviewed the draft report for technical adequacy and for
application to current or future weapon system acquisitions. The
report is generally technically correct. However, because this is a
large and complex subject, the scope and focus of the report need
to be better defined, otherwise the report will be misleading with
respect to the use of simulations in the acquisition process.
Technical corrections and additions are noted in the enclosure.

With minor modifications the DoD concurs with the
recommendations and is continuing efforts in this important area.

In the findings, the GAO makes the following statement: "...one
limitation of this approach (case study method) is that it prevents
generalizing from the findings...." The DoD review indicates this
was done, however. DoD comments are that the findings and
recommendations may be valid with respect to most high level force
on force models. The models that are used for simulations below the
"war gaming" level need to be considered. Without these additional
models/simulations being considered, unexpected and, in many cases,
incorrect predictions will result.




The DoD finds the examination of three models and two weapon
systems is neither a large nor broad enough sample size for the
extrapolation suggested. Final acquisition decisions are based on
many factors. DoD Directive 5000.3 indicates the use of simulation
data is a subset of the test portion of the decision process. The
part the three simulations played in the acquisition decisions of
the two programs evaluated by the GAO was not addressed in the
report.  The  14  factors  for  evaluating  simulations  are  a  useful 
tabulation,  but  must  include  the  caveat  to  take  into  account  the 
final  use  (from  an  acquisition  standpoint)  of  the  simulation 
results.  The  GAO  factor  checklist  is  inappropriate  to  evaluate  a 
simulation  model's  credibility  separate  from  its  people,  application 
and  input  data.  It  is  disappointing  that  the  report  does  not  have  a 
consistent  scope  and  focus.  Models  are  only  tools.  A  consistent 
and  useful  focus  of  the  report  would  be  the  quality  of  workmanship 
which  depends  on  the  combination  of  qualified  people,  model 
application,  input  data  choice  and  the  assessment  of  the  results. 

The  report  recommends  that  the  DoD  provide  formal  guidance  for 
the  use  of  models/simulations  in  the  acquisition  process.  The 
Military  Operations  Research  Society  (MORS)  has  devoted  a  great  deal 
of attention to this area. The monograph on Military Modeling, for
example, gives useful guidance on the subject of verification and
validation.  The  state  of  the  art  is  still  evolving  and  is  still 
subject  to  discussion  and  contention.  These  aspects,  therefore,  may 
not  be  ready  for  treatment  by  formal  regulation.  The  14  GAO  factors 
have  been  forwarded  to  the  Services  for  review  and  evaluation. 

The Defense Systems Management College (DSMC) includes this area
in  the  Test  and  Evaluation  Management  Course.  The  manual  for  this 
course  includes  a  chapter  on  the  use  of  simulations  in  test  and 
evaluation.  This  course  will  be  offered  starting  this  December. 

The OT&E Commanders' Conference also included a review of this
subject. 

The  detailed  DoD  comments  on  the  report  findings  and 
GAO DRAFT REPORT - DATED JULY 2, 1987


"DOD SIMULATIONS: LACK OF ASSESSMENT PROCEDURES THREATENS
CREDIBILITY OF RESULTS"


(GAO CODE 973195) (OSD CASE 7336)

DOD RESPONSE TO GAO DRAFT REPORT

********

FINDINGS 

o  FINDING A: Simulations. The GAO noted that multi-billion
dollar acquisition decisions for major weapon systems should,
ideally, be based on testing the operational performance of weapons
under conditions that replicate actual combat; however, as weapon
systems have become more complex and expensive, the practicality of
subjecting them to the necessary number of such tests has
diminished. The GAO observed that one alternative has been to turn to
the use of simulation models to provide additional information
regarding performance. The GAO noted that simulations can be, and
often are, used throughout the life cycle of a weapon system and are
frequently used in connection with other analytical methods and
field experimentation. The GAO found that the overriding advantage
of simulation is perhaps the opportunity to investigate questions
and problems that would otherwise not be addressed and to
investigate them systematically with numerous replications under
controlled conditions. On the other hand, the GAO noted that
simulation has disadvantages. The GAO explained that, because a
model is only an approximation, not the equivalent, of a real
system, inaccurate assumptions about a weapon or its environment may
cause the results of a simulation to diverge from reality. The GAO
concluded that although simulations are useful tools, they are
always approximations to reality and, therefore, their
credibility--the level of confidence that a decision-maker should
have in their results--is open to question. (pp. 1-3 Executive
Summary; pp. 1-1 to 1-6/GAO Draft Report)

Now pages 1 and 10-12.

DOD RESPONSE: Concur.

It should be noted, however, that acquisition decisions are based on
many factors and simulation results are considered to be a subset of
developmental and operational tests. The Army used the three
simulation models (used to evaluate the GAO factors) because of the
battlefield complexity as well as the complexity of DIVAD and
STINGER. Many levels of simulations are used by DOD in the
acquisition of weapon systems. These simulations range from
subsystem simulations to complex weapon systems on to wargame
simulations to determine weapons systems requirements.

o  FINDING B: Factors In Assessing A Simulation's Credibility.
The GAO observed that various procedures have been proposed to
permit reasoned judgment concerning the credibility of simulation
results. Based on a review of the literature and consultations with
the simulation experts, the GAO developed a framework of 14 factors
that the GAO concluded are necessary to address in an attempt to
assess the credibility of a simulation. The GAO reported that these
factors fall under three broad areas of concern regarding
simulations--(1) theory, model design, and input data, (2) model
validation, and (3) model management and reporting. The GAO
observed that severe limitations in any of these three areas would
lead to doubts about the credibility of a simulation, but for
different reasons. As examples, the GAO cited:

-  problems with the first area of theory, model design or input
data would pose questions about the basic integrity of the
simulation's internal structure;

-  little or no evidence in the second area of model validation
would leave a user with insufficient proof of the extent to which
the simulation represents reality;

-  the absence of efforts in the third area would create doubts
that good practices had been used to assure quality in the first two
areas, that the continuing integrity of the model is assured, and
that critical limitations had been properly disclosed.

The GAO concluded that the overall framework is feasible to apply
and will, at least for operational effectiveness simulations,
provide a structured, useful way to review the credibility of
simulation results. (pp. 2-1 to 2-11, pp. 8-1 to 8-2/GAO Draft
Report)

Now pages 17-22 and 60.
See comment 2.

DOD RESPONSE: Partially concur.

From a textbook point of view the three areas of assessment and the
14 factors certainly should be considered in the development and use
of simulations. As a practical measure, however, large scale
simulations of complex systems or large force simulations do not
easily lend themselves to the total level of validation suggested.
Consistency of results also is not always an indicator of good
results. The DoD considers these shortcomings in the use of
simulation data in its acquisition decisions. In addition, people,
simulation application, and type of input data also need to be
considered.

o  FINDING C: The Case Study Simulations. The GAO reported that
in order to examine general purpose models, consideration was
restricted to models that had the capability of simulating several
types of weapons. The GAO reported that it judgmentally selected
two Army antiaircraft defense systems: the portable, shoulder-fired,
infrared surface-to-air STINGER missile and the division air defense
gun (DIVAD), a surface-to-air radar-guided gun on a tracked
vehicle. For these two weapon systems, the GAO selected three
simulations: the COMO III model for the STINGER and the Carmonette
and Air Defense Air-to-Ground Engagement (ADAGE) models for the
DIVAD. The GAO reported key features of the simulation models, as
follows:

-  the ADAGE model is a functional simulation used to study the
relative effectiveness of combinations of air defense weapons in a
division;

-  the Carmonette is an event-sequenced, combined-arms combat
model that simulates small-unit, ground combat involving the actions
of individual soldiers and weapons; and

-  the COMO III, used primarily for studies of tactical air
defense, is a Monte Carlo simulation in which particular submodels
are aggregated to simulate a specific air defense environment.

The GAO concluded that the case study method was the most plausible
for illustrating application of the framework. The GAO further
concluded, however, that one limitation of this approach is that it
prevents generalizing from the findings beyond the three cases.
(p. 1-9, pp. [illegible] to 3-10/GAO Draft Report)

Now pages 14 and 23-28.

DOD RESPONSE: Concur.

o  FINDING D: The Credibility Of Selected Simulations Based Upon
Theory, Model Design and Input Data. The GAO noted that since the
credibility of simulation results is relative to the intended use,
the results from a model that is misapplied will not be credible,
even though the model itself is sound. The GAO found that, in
almost all instances, the case study simulations had some
limitations. With regard to matters of the weapon-target engagement
after detection had occurred, the GAO found that the evidence
indicated that all three simulations had considerable capability.
The GAO noted that this also appears to be the case for all three
models in simulating important aspects of measures of
effectiveness. The GAO concluded that for some of the other areas,
the effort required to remove them might be relatively minor, while
for others, much more work could be required. The GAO further
concluded that, in a few cases, an effort to fix the model may not
really be the appropriate response and instead, using a different
model might be more appropriate. (p. 4-1, pp. 4-11 to 4-12/GAO
Draft Report)

Now pages 29 and 39.

DOD RESPONSE: Concur.

The  GAO  reports  elsewhere  that  the  three  simulations  were  useful. 
They  were,  however,  only  a  small  set  of  the  tools  used  to  provide 
inputs  to  the  decision  process.  It  should  be  recognized  that  the 
Army  does  have  a  mechanism  to  update  the  simulations  as  more 
information  becomes  available,  and  does  endeavor  to  use  the  proper 
simulation  for  each  task,  if  such  a  simulation  exists. 

o  FINDING E: The Match Between The Theoretical Approach And The
Questions Posed. The GAO found that the ADAGE, designed as a
functional air defense model, was, in general, a reasonable choice
for estimating the effectiveness of the DIVAD. In contrast to
ADAGE, the GAO noted that Carmonette was designed to answer broad
tradeoff questions which go beyond the issues of an air arms model.
The Carmonette is generally not well-suited to answering the kinds
of questions that were posed about the DIVAD. Specifically, the GAO
found that the model attempts to portray an overall ground battle
with limited air war features, but it is not focused on individual
weapon systems. The GAO further found that, in general, the COMO
III model was properly matched to the questions asked and was based
on a scenario that represented official policy at the U.S. Army Air
Defense Artillery School (USAADASCH). The GAO concluded that, while
Carmonette has sound theory for a combined arms analysis, its
approach is not the most appropriate for decisions regarding
competing air defense weapons. The GAO further concluded that ADAGE
and COMO were designed with such decisions in mind. (pp. 4-2 to
4-3; p. 3, Appendix II/GAO Draft Report)

Now pages 29-30 and 68-69.

DOD RESPONSE: Concur.

It must be noted, however, that functional models do not lend
themselves to use in combined forces examinations; therefore, any
comparison is not appropriate.

o  FINDING F: Coverage Of Operational Measures Of Effectiveness.
The GAO found that both the ADAGE and the CARMONETTE simulations
provide for the coverage of protection of critical assets to some
degree, although the former emphasized protection of assets, whereas
the latter emphasized measures of aircraft attrition. The GAO
further found that, although the COMO III simulation concentrated on
both attrition-type measures and logistics measures, it was more
limited in its ability to use preservation of assets deployed in the
forward area as a principal measure of effectiveness because the
ground war is not simulated. The GAO concluded that this condition
threatens the credibility of the results of this simulation. While
all the models address operational measures of effectiveness, the
GAO concluded that ADAGE appears to relate its measures more closely
to the ultimate missions of air defense--protection of assets--while
the other models stress loss-exchange ratios. (pp. 4-3 to 4-4/GAO
Draft Report)

Now pages 30-31.

DOD RESPONSE: Concur.

While this is a correct assessment of the proper utilization of the
three simulation models, the distinction is the subjective
weighting of results. The models are otherwise equivalent.

o  FINDING  G:  Portrayal  Of  The  Weapon  System's  Immediate 
Env l ronaent .  The  GAO  observed  that,  with  some  limitations,  the  ADAGE 
and  the  COMO  III  models  were  capable  of  simulating  the  weapon 
systems'  immediate  environment  across  the  five  attributes  of  battle 
(i.e.,  level  of  battle,  length  of  battle,  targets, 
dep 1 oyment /movement ,  and  terrain).  The  GAO  found  that  both  were 
strong  in  characterizing  the  desired  size  of  the  battle  and  the  full 
range  of  targets.  Specifically,  the  GAO  reported  that  the  ADAGE 
model  simulated  longer  battles  but  was  limited  by  its  uniform  and 
static  deployment  of  weapons.  On  the  other  hand,  the  GAO  reported 
that  COMO  III  portrayed  a  shorter  battle  with  STINGER  weapons  that, 
while  realistically  deployed,  did  not  move,  a  limitation  for 
man-portable  systems  for  which  movement  provides  a  form  of  defense, 
but  at  a  cost  of  decreased  operability.  The  COMO  111  and  ADAGE 
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models  used  different  approaches  to  portray  terrain;  however,  the 
GAO  noted  that  neither  approach  is  obviously  superior.  The  GAO 
found  that  Carmonette  was  more  limited  in  its  ability  to  portray  the 
immediate  environment  than  ADAGE  or  COMO  III.  As  an  example,  the 
GAO  cited  that  the  small  battalion  size  and  short  length  of  the 
Carmonette battle were inappropriate for the DIVAD weapon and the
lack of fixed wing aircraft targets for most of the analyses
contributed to an inappropriate target set. While these limitations
were  partially  offset  by  Carmonette's  realistic  portrayal  of 
deployment,  movement  and  terrain,  the  GAO  concluded  that  these 
limitations  of  Carmonette's  portrayal  of  the  immediate  environment 
threaten  its  credibility.  The  GAO  concluded  that  all  of  the  models 
are restricted and incomplete in some respect in their coverage of
the surrounding combat arena. (pp. 4-4 to 4-6; p. 4, Appendix
II/GAO Draft Report)

Now pages 31-33 and deleted

DOD RESPONSE: Concur.

As indicated by GAO (See Finding A), simulations are not exact
replicas of systems or battles. The limitations are acknowledged
and taken into account, as necessary. Two questions were of
interest, however: Does the system do what it is supposed to do?
Does it also provide a positive value in the battlefield when used
with other systems?

o FINDING H: Broad-Scale Battle Environment. The GAO observed
that any model of modern warfare should address the critical aspects
of that warfare--in the air defense tactical areas this includes
three dimensions--the air war, the ground war, and the interaction
of the two. The GAO found differing approaches among the models in
the coverage given to various aspects of modern warfare and how they
interact with one another. The GAO further found that all of the
models have serious weaknesses in the portrayal of at least one
critical aspect of the air defense combat areas.

-- the ADAGE and COMO give inadequate consideration to the effect
of ground war activities on air defense weapons and they do not
completely portray the interaction of air and ground activities; and

-- the Carmonette's treatment of the air war is incomplete since
it consistently failed to include fixed wing aircraft effects and
only recently addressed these aircraft even in an indirect manner.

The  GAO  concluded  that  ADAGE  and  COMO  strengths  lie  in  the  portrayal 
of  ground  activities.  The  GAO  further  concluded  that  all  of  the 
models have certain strengths in dealing with critical aspects of
air defense weapons, but all of them also have serious weaknesses.
(pp. 4-7 to 4-8; pp. 33-41, Appendix II/GAO Draft Report)

Now pages 35-36 and 83-87

DOD RESPONSE: Concur.

The strengths and weaknesses of the ADAGE, COMO III and Carmonette
models are noted for the weapon systems reviewed. The seriousness
of the weaknesses is related to the use of the results, which is
appropriate. (See DoD response to Finding A).

Now pages 36-37 and 87-91
See comment 3

o FINDING J: Mathematical And Logical Representations Used To
Depict Combat. The GAO observed that another critical area of

concern  in  modeling  weapon  system  operational  effectiveness  is  how 
the  theory  and  phenomena  are  mathematically  and  logically 
represented.  The  GAO  found  three  areas  of  concern  about  ADAGE: 

-- its expected value approach used for modeling the
engagement of a multiple air defense weapon against
multi-plane attacks;

-- its use of probability of participation of air defense
weapons; and

-- its apparent exaggeration of DIVAD survivability.

The  GAO  further  found  that  Carmonette's  basic  mathematical 
formulations  of  fixed  wing  aircraft  engagements  are  not  much 
different  from  ADAGE;  moreover,  both  of  these  models  have  other 
mathematical/logical problems, which though not as serious,
nevertheless threaten the credibility of model results. The GAO
reported that only COMO III appears to be free of serious
mathematical/logical problems in the model structure. The GAO
concluded that, of the three models, ADAGE had the greatest number
of mathematical and logical flaws, raising concerns about the
credibility of its results. (pp. 4-8 to 4-9; pp. 47-49, Appendix
II/GAO Draft Report)

DOD RESPONSE: Partially concur.

The expected value approach is not intrinsically bad. Also,
simulation models must, of necessity, limit replication to match
run-time constraints, costs and expected input data.

o FINDING K: Appropriateness Of Input Factors. The GAO observed
that, since the whole simulation can falter when input data are not
clearly relevant, complete information about the data is necessary.
The GAO found that all of the models use recognized data sources.
The GAO observed that the Carmonette and COMO III models are
relatively  strong  with  respect  to  the  appropriateness  of  data.  In 
addition,  the  GAO  observed  that,  in  the  earlier  analyses,  ADAGE  and 
Carmonette  modelers  differed  in  the  selection  of  input  data  and 
detection  models  for  visual  detection  of  approaching  aircraft; 
however,  the  compromise  position  reached  resulted  in  the  use  of  data 
that did not properly describe DIVAD detection capabilities. The
GAO further found that, with regard to data reflecting the real
world, the ADAGE simulation had the most serious limitations because
some of its data were outdated and some key values were too large to
be accepted by knowledgeable military officials. In addition, the
GAO  found  that  ADAGE  modelers  included  weapon  system  characteristics 
as  an  integral  part  of  the  model  rather  than  addressing  them  through 
an external data base, which is contrary to the Joint Forward Area
Air Defense (JFAAD) Test Force suggested model requirements. The GAO
noted that, while some of the Carmonette's early data problems were
corrected, the problems with disputed visual detection data
remained. The GAO also found that the Carmonette and COMO III
simulations were limited in data handling because of the extensive
data tailoring required. The GAO concluded that ADAGE and
Carmonette both experienced problems with obtaining appropriate data
and these problems were related to the credibility of simulation
results. The GAO further concluded that because Carmonette and COMO
III require extensive data tailoring, the effects of data tailoring
cannot be easily distinguished from model manipulations. (pp. 4-9
to 4-11, pp. 49-58, Appendix II/GAO Draft Report)

Now pages 37-38 and 91-95

DOD RESPONSE: Concur.

Data tailoring, however, is a useful tool in examining model or
system sensitivity.

Now pages 40 and 46

o FINDING L: The Credibility Of Selected Simulations Based Upon
The Correspondence Between A Model and The Real World. The GAO
found that the factors that address the credibility of the
simulation (based upon correspondence between the model and the real
world) are--(1) evidence of a verification effort, (2) evidence that
the  results  are  statistically  representative,  (3)  evidence  of 
sensitivity  testing,  and  (4)  evidence  of  model  validation.  The  GAO 
observed  that  while  analysts  can  never  provide  absolute  guarantees 
regarding  model  credibility  or  output  accuracy,  several  steps  can  be 
undertaken  to  determine  whether  a  simulation  is  sufficiently  close 
to  representing  the  operation  of  an  actual  weapon  system.  These 
steps include analyses to produce evidence that:

-- the computer program operates in the manner intended by the
simulation model's designers;

-- the output of the simulation is sufficiently representative of
what the average model output would be over many runs;

-- the sensitive model parameters and alternative scenarios are
properly accounted for by the simulation results; and

-- the simulation results are a sufficiently accurate
representation of what the real world results would be.

In  general,  the  GAO  concluded  that  efforts  to  directly  validate 
simulation  results  by  comparison  to  weapon  effectiveness  results 
derived  by  other  means  were  very  weak,  requiring  substantial  work  to 
increase  credibility.  The  GAO  further  concluded  that  more 
enhancement  of  credibility  could  have  been  achieved  by  more 
intensive  efforts  to  document  the  verification  of  the  computer 
representations of the real world, and to establish that the
simulation results were statistically representative, or to directly
validate simulation results by comparison to weapon effectiveness
results derived by other means. In addition, the GAO further
concluded that the strongest contribution to credibility probably
came from efforts to test the parameters of models and to run the
models with alternative scenarios. (pp. 5-1 to 5-2; p. 5-11/GAO
Draft Report)

DOD RESPONSE: Concur.

Simulations  can  also  be  used  to  examine  trends.  It  should  be 
recognized that real world data is not always available. As an
example, treaties may preclude testing. In some instances these
trends may be more revealing than real world data in assessing the
effectiveness of a weapon system. In addition, there are cases
where the model user deviates from the "real world" for good and
sufficient  reasons.  The  credibility  of  results  to  the  new  user  can 
be  biased  by  good  documentation  that  provides  strengths  and 
limitations,  as  well  as  technical  instructions.  DOD-STD-2167 
provides  for  documentation  to  be  included  in  software  design  and  is 
as  applicable  to  simulation  models  as  to  weapons  system  design. 
Strengths  and  weaknesses  that  address  non-valid  uses  would  be  more 
appropriate,  when  applied  to  large  scale  simulations. 

o  FINDING  M:  Evidence  Of  A  Verification  Effort.  The  GAO 
observed that verification refers to the process of determining if
the computer program correctly represents the theory, model design,
and input data. The GAO found that no documentary evidence of
verification was available for either ADAGE or Carmonette. While
there was no evidence of verification efforts on Carmonette, the GAO
noted that it was informed that Carmonette had been subjected to
extensive peer reviews. The GAO further found that it could not
identify any verification of the COMO III STINGER simulation models
or the variant that was developed for the battery coolant unit (BCU)
analysis. The GAO concluded that simulation users are entitled to some
knowledge of the verification efforts that were involved in
simulation development and that such information will strengthen the
credibility of simulations. The GAO further concluded that lack of
documented evidence of verifications presents a clear threat to the
credibility of the simulations. (pp. 5-2 to 5-3; pp. 30-31,
Appendix III/GAO Draft Report)

Now pages 40-41 and 113-14

DOD  RESPONSE:  Concur. 

The  more  a  model  is  used,  the  more  it  is  subject  to  peer  review. 

The  real  problem  is  documentation,  which  is  a  limitation  throughout 
the  modeling  world.  On  simulations  used  for  the  first  few  times, 
this finding may be an accurate statement, but depends on the
expertise of the user. The statement becomes less and less accurate
as validated data is provided. Long term credibility lies with
consistent data, validated with actual measured data. DoDD 5000.3
requires a verification and validation effort.

o FINDING N: Evidence That The Results Are Statistically
Representative. The GAO observed that the credibility of simulation
results is enhanced when users are assured that simulation outputs
are representative of how the model will perform during repeat
runs. The GAO observed that, within ADAGE, the incursion submodel
is the only model using the Monte Carlo modeling technique. The GAO
noted that, according to analysts who worked with ADAGE, each
incursion scenario had been run five hundred times and the resultant
mean was within one or two percent of the true mean at the NR
percent confidence level. The GAO further found that, for
Carmonette, analysts addressed the statistical representativeness
factor, but with only limited success. As an example, the GAO cited
that the COMO III STINGER simulation made only one run for each
scenario, and the report did not address the statistical
representativeness of the results. The GAO also found that the COMO
III STINGER simulation appeared not to have addressed the need for
developing statistically representative model values, which the GAO
concluded represents an outright threat to the credibility of the
simulation. The longer-running, more complex simulations were
evaluated with fewer simulation runs; therefore, the GAO concluded
that if this represents a tendency to treat the results of one or a
few runs of a complex model as being the "True" estimates for the
simulation model, this situation has the potential to create
substantial credibility questions. (pp. 5-3 to 5-5a, p. 5-10; p.
31, Appendix III/GAO Draft Report)

Now pages 41-42, 46 and 113

DOD RESPONSE: Concur.

The statistical validity should always be addressed. Review groups
are in place to accomplish this.
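
The point about repeat runs can be illustrated with a minimal sketch. It is not taken from ADAGE, Carmonette, or COMO III: the function names, the toy engagement model, and its parameter values are all hypothetical, and the 1.96 multiplier assumes a normal approximation at roughly the 95 percent confidence level, which is an assumption of the sketch and not a figure from the report.

```python
import math
import random
import statistics


def replicate(run_once, n_runs, seed=0):
    """Run a stochastic simulation n_runs times and collect the outputs."""
    rng = random.Random(seed)
    return [run_once(rng) for _ in range(n_runs)]


def mean_with_half_width(outputs, z=1.96):
    """Return (sample mean, confidence-interval half-width).

    z=1.96 is roughly a 95 percent level under a normal
    approximation; this level is an assumption of the sketch.
    """
    mean = statistics.fmean(outputs)
    std_err = statistics.stdev(outputs) / math.sqrt(len(outputs))
    return mean, z * std_err


def toy_engagement(rng, shots=20, p_kill=0.3):
    """Hypothetical stand-in for one run: fraction of shots that kill."""
    return sum(rng.random() < p_kill for _ in range(shots)) / shots


# Five hundred replications, as in the incursion example cited above.
many_runs = replicate(toy_engagement, 500)
mean, half_width = mean_with_half_width(many_runs)
relative_precision = half_width / mean  # half-width as a fraction of the mean
```

With a single run per scenario no spread can be computed at all (statistics.stdev needs at least two values), which is one way to see why a lone run cannot be shown to be representative.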

o FINDING O: Evidence Of Sensitivity Testing. The GAO observed
that this factor addresses the need for simulation analysts and
users to develop an understanding of how changes in key parameters
affect the model results. The GAO observed that sensitivity testing
identifies how changes in model parameters affect results in both
direction and magnitude and provides the analysts expectations of
model behavior. The GAO found that ADAGE analysts used both
parameter testing and experimentation with alternative scenarios to
examine simulation results for both small and major changes. The
GAO further found that the credibility of both Carmonette and COMO
III also benefited from the use of alternative scenarios and
parameter testing. The GAO found that sensitivity testing was a
factor, for all three simulations, which contributed to a
strengthening of the credibility of the models. The GAO observed
that the apparent need is to integrate parameter and scenario work
with the work performed on determining the true estimates for the
simulation. The GAO concluded that valuable information that
contributed to simulation result credibility was developed in ADAGE,
Carmonette and COMO III by varying parameters and testing
alternative scenarios. (pp. 10-10, p. 3 T, Appendix III/GAO Draft
Report)

Now pages 103-06 and 114

DOD RESPONSE: Concur.
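
As a rough illustration of what is meant by testing how parameter changes affect results "in both direction and magnitude," the sketch below perturbs one parameter at a time in a toy model. The model, its parameter names, and the 10 percent perturbation are hypothetical choices made for the example; none of them come from the three simulations reviewed.

```python
import math


def toy_model(params):
    """Hypothetical effectiveness measure: expected kills per engagement."""
    return (params["shots"] * params["p_kill"]
            * math.exp(-params["reaction_time"] / 10.0))


def one_at_a_time_sensitivity(model, baseline, change=0.10):
    """Raise each parameter by `change` (relative), holding the others
    fixed, and return the relative change in the model output."""
    base_output = model(baseline)
    effects = {}
    for name, value in baseline.items():
        perturbed = dict(baseline)
        perturbed[name] = value * (1.0 + change)
        effects[name] = (model(perturbed) - base_output) / base_output
    return effects


baseline = {"shots": 10.0, "p_kill": 0.3, "reaction_time": 5.0}
effects = one_at_a_time_sensitivity(toy_model, baseline)
# The sign of each entry gives the direction of the effect;
# the absolute value gives its magnitude.
```

One-at-a-time perturbation is only the simplest form of sensitivity testing; it does not capture interactions among parameters, which is part of why the finding also mentions alternative scenarios.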

o  FINDING  P:  Evidence  Of  Model  Validation.  The  GAO  observed 
that  validation  is  the  process  of  determining  that  a  model  is  an 
accurate  representation  of,  or  is  in  agreement  with,  the  real  world 
system  being  modeled.  The  GAO  observed  that  validations  are  not 
planned  for  or  conducted  routinely  but  are  more  likely  to  be 
performed  when  a  disparity  in  results  is  found  among  different 
performance  estimating  methods.  The  GAO  found  no  validation  efforts 
had been performed on ADAGE or Carmonette using real world DIVAD
data. (The GAO noted that this is not intended to suggest that
there was no attempt at validation.) The GAO further found no
evidence of validation specifically for the COMO III/STINGER
simulation; however, evidence was found of an effort to validate the
general COMO model by comparing COMO results to those from an Air
Force model called SORTIE. The GAO observed that the results suggest
that model-to-model validation can marginally strengthen the
credibility of a model, especially when comparisons with real world
data are lacking. The GAO generally concluded that efforts to
directly validate simulation results by comparison to weapon
effectiveness results derived by other means are very weak and
require substantial work to increase credibility. The GAO further
concluded that validation based on a model-to-model comparison
contributes substantially to model credibility and should be
performed as a normal part of the simulation cycle. (pp. 5-8 to
5-11; p. 32, Appendix III/GAO Draft Report)

Now pages 44-46 and 114

DOD RESPONSE: Concur.

Validation indicates end use. It should be noted, however, that
prior to the validation, trends are usually more appropriate.

Now pages 47-49

o  FINDING  Q:  The  Support  Structures  Established  To  Manage  The 
Simulation Design, Data, And Operating Requirements. The GAO
observed that institutional practices or mechanisms can help to
ensure that credible simulations are established and maintained.
The GAO found that two such practices in reviewing ADAGE,
Carmonette, and COMO III were configuration management and the use
of oversight and review groups. The GAO reported that each of the
models had been assigned to an agency for management: ADAGE to the
U.S. Army Materiel Systems Analysis Activity (AMSAA); Carmonette to
the U.S. Army Training and Doctrine Command (TRADOC) Systems Analysis
Activity; and COMO III to the U.S. Army Missile Command. The GAO
noted  that  the  TRADOC,  which  plays  a  role  in  both  managing  and  using 
simulation  models,  illustrates  configuration  management  support. 

The  GAO  further  found  that,  in  an  effort  to  establish  oversight  and 
review  at  different  levels,  the  TRADOC  established  Study  Advisory 
Groups  (SAGs)  to  monitor  the  progress  of  individual  studies  using 
models  under  TRADOC  control.  The  GAO  concluded  that  the  Army  has 
been  at  least  partially  successful  in  establishing  mechanisms  to 
maintain  simulation  models  and  to  control  their  development  and 
use.  While  formal  control  responsibilities  were  assigned  for  each 
case  study  model,  the  existence  of  several  stakeholder  groups  with 
various  roles  to  play  may  indicate  an  immature  and  still  evolving 
structure;  however,  the  GAO  further  concluded  that  the  present  mix 
may  be  appropriate  as  a  permanent  structure,  which  recognizes  the 
diffuse Army interests in simulation. (pp. 6-1 to 6-4/GAO Draft
Report)

DOD RESPONSE: Concur.

The  management  procedures  evidenced  and  reported  by  the  GAO  are 
appropriate  to  the  continuing  credibility  of  these  simulation  models 
and their modifications. However, as indicated earlier, these
procedures will not ensure valid data.

o FINDING R: The Documentation Needed By Persons Using The
Simulation Or Its Results. The GAO observed that well-documented
simulation models inspire confidence that the models will be
correctly used to address the types of issues for which they were
designed. The GAO found that the ADAGE was a relatively
well-documented model, at least through September 1978. The GAO
further found that Carmonette is a relatively poorly documented
model, which became evident during the Cost and Operational

Effectiveness  Analysis  (COEA)  Update  Study,  when  analysts  at  the 
USAADASCH  tried  to  reconcile  disparities  in  the  results  produced  by 
ADAGE  and  Carmonette.  The  GAO  reported  that  concern  about  the  lack 
of  Carmonette  documentation  was  also  expressed  by  the  Chairman  of 
the Study Advisory Group charged with overseeing the COEA Update for
DIVAD.  The  GAO  found  that  extensive  documentation  exists  for  the 
COMO  series  of  models.  The  GAO  concluded  that  in  the  case  of  COMO 
III,  and  to  a  lesser  extent  for  ADAGE,  the  documentation  tends  to 
strengthen  the  confidence  of  the  user  in  the  credibility  of  the 
simulation.  On  the  other  hand,  the  GAO  concluded  that  the 
considerable  lack  of  documentation  for  Carmonette  detracts  from  the 
confidence  that  a  prospective  user  might  have  in  its  credibility. 

(pp. 6-4 to p. 6-6, 6-10/GAO Draft Report)

Now pages 49-51 and 53

POD  RESPONSE:  Concur. 

(See  DoD  response  to  Recommendation  7). 

o FINDING S: Disclosure Of The Simulation Strengths and
Weaknesses. The GAO examined several reports: (1) for ADAGE--the
1977 DIVAD COEA Report, the 1984 COEA Update and the 1985 DIVAD
Comparative Analysis; (2) for Carmonette--the 1984 COEA Update and
the 1985 Comparative Analysis; and (3) for COMO III--the STINGER
Battery Coolant Unit Study Report, a validation report for PATRIOT
missiles studies and documentation for the STINGER model. The GAO
found that:

-- the ADAGE reports contained explicit statements of the study
objectives and the strengths and limitations of the particular
simulation;

-- the Carmonette 1984 COEA Update appears to make
recommendations that are not well-supported by simulation results.
Little or no attention is given to the theoretical basis of the
analyses; and

-- the report on STINGER Battery Coolant Unit Study clearly
developed the rationale for the scenarios and identified limitations
due to both the computer and the model. One limitation of the
report was the implicit assumption that the submodel for another air
defense weapon being simulated within COMO III was sufficiently
credible and accurate and that the overall results would not be
biased.

The GAO concluded that, while reporting practices could be improved,
they contributed to the credibility of all three simulations. (pp.
6-4 to 6-11/GAO Draft Report)

Now pages 49-53

DOD RESPONSE: Concur.

Reporting practices contribute to the credibility of the results.
Again, it should be noted that strengths and weaknesses need to be
relative to the use of the simulation data.

Now pages 54-55

o  FINDING  T:  OSD  LEVEL  GUIDANCE.  The  GAO  found  no  formal 
guidance  specifically  for  simulation  at  the  OSD  level;  however, 
related  OSD-level  regulations  that  included  concepts  which  could  be 
applied  to  computer  simulations  did  exist.  Specifically,  the  GAO 
reported that:

-- the need for information and the use of analysis to support
weapon system acquisition decisions is stated in that some form of
system effectiveness analysis, in conjunction with analyses of costs
and other factors, shall be performed to support milestone
decisions; and

-- the test and evaluation directive, DoDD 5000.3, states that
the use of properly validated analysis, modeling, and simulation is
strongly encouraged, especially during early development phases.

The  GAO  concluded,  however,  that  while  the  above  directives 
encourage  the  use  of  validated  simulations,  they  do  not  give 
guidance  on  prerequisites  for  sound  simulations,  on  how  to 
develop them, nor how to validate them. The GAO noted that the
Automatic Data Processing (ADP) and Information Resources Management
(IRM) regulations may be applicable, at least in part, because
simulations are run on computers; however, the GAO found that these
directives focus largely on input/output processing and file
structure. The GAO, therefore, further concluded that while
directives or standards in this area are useful, they are inadequate
to guide the development and maintenance of computer simulations.
(pp. 7-1 to 7-3/GAO Draft Report)

DOD RESPONSE: Concur.

The DOD provided the 14 factors to the Services for review and
evaluation on August 4, 1987. The Defense Systems Management
College included the use of simulation results in the new course on
test and evaluation management beginning on December 14, 1987. The
OT&E Commanders' Conference held August 1987 reviewed this same
subject. (See DoD response to Recommendation 1.)

Now pages 55-56 and 59

o FINDING U: The Software Quality Issue. The GAO observed that,
while the interest in software quality began with weapon system
software applications, it may be generalized to all computer
systems. The GAO found, however, no substantial interest in, or
recognition of, the importance of the issue of software quality.
The GAO recognized that some arguments can be made against
designing, programming and testing software to satisfy the
established quality standards for some simulations that are small,
short-lived, limited purpose applications. On the other hand, the
GAO pointed out there is a class of simulations that are long-lived,
that develop a community of users, and are intensive consumers of
computer resources. The GAO observed, therefore, that over their
lifetime these simulation systems will be intensive users of
manpower and computer resources and, additionally, their results may
influence major decisions. The GAO concluded that more attention by
management to the technical aspects of modeling, such as software
quality, statistical analysis, and validation, would encourage the
greater adoption of practices to assess and improve simulation
credibility. (pp. 7-3 to 7-5, p. 7-11/GAO Draft Report)

DOD RESPONSE: Concur.

The DOD Standard 2168 addresses this issue and is in final review.
The standard is scheduled to be issued by March 1988.

Now pages 57-59

o FINDING V: Service Level (Army) Regulations And Practices.
The GAO found that at the Service level, the Army has issued
regulations that address (1) the management of models within the
context of the Army Model Improvement Program and (2) the management
of studies and analyses, of which models are a component. The GAO
reported that a major Army effort to develop a comprehensive
hierarchical modeling system, reflecting the guidance of the Army
Models Committee, was formalized in the issuance of the Army
Regulation 5-11 (AR5-11), Army Model Improvement Program. (The GAO
noted that the purpose of the program is to evaluate combat
capabilities and determine resource requirements through an
integrated system of models operating at the theater force,
corps/division, and combined arms and support task force levels.)

The GAO found AR5-11 to be the most detailed statement issued by the
Army regarding modeling policy and practice among the documents
reviewed. The GAO further reported that the TRADOC provides
specific guidance on managing models, and on using and reporting on
simulations as part of studies. The GAO further reported that, in
addition to issuing regulations as a means of guiding studies and
models, the Army has established various groups to address technical
and management aspects of the studies and modeling process. The GAO
concluded that overall, the Army analytical community appears to be
concerned about the quality of its models and its responsibility to
provide guidance for model management. (pp. 7-5 to 7-10/GAO Draft
Report)

DOD RESPONSE: Concur.

RECOMMENDATIONS 

o RECOMMENDATION 1: The GAO recommended that the Secretary of
Defense  adopt  or  develop  and  implement  guidance  on  producing, 
validating,  documenting,  managing,  maintaining,  using,  and  reporting 
weapon  system  effectiveness  simulations.  In  the  GAO  view,  this 
guidance  should  include  a  mechanism  to  routinely  provide  reviews  of 
a  simulation's  credibility  and,  in  this  way,  identify  problems  that 
need  to  be  resolved.  The  GAO  also  suggested  that  the  OSD  should 
explore  including  a  requirement  for  a  statement  regarding  validation 
efforts to accompany simulation results. (p. 9/GAO Draft Report)

Now page 5

DOD RESPONSE: Concur.

The  GAO  has  addressed  an  area  of  concern  to  the  DOD.  As  indicated 
earlier,  the  14  factors  provided  were  forwarded  to  the  Services  on 
August  4,  1987.  The  inclusion  of  simulation  modeling  and  simulation 
results in the OT&E Commanders' Conference and in the new DSMC
course are additional evidence of the importance the DOD is giving
to this area. Specific guidance from these initiatives will be
considered. The DOD will provide specific inputs on this
recommendation within 6 months. (See DoD response to Finding T).

o RECOMMENDATION 2: The GAO recommended that the Agencies
responsible for managing the ADAGE, Carmonette, and COMO models
explore the feasibility of and, where indicated, take actions to
correct, the limitations the GAO has identified--especially in the
validation area. (p. 9/GAO Draft Report)

Now page 5

DOD RESPONSE: Concur.

The Army management process for the use of these models is
continuing  to  see  that  the  models  are  used  properly  and  are  updated 
and  corrected  when  and  where  necessary.  This  action  will  be 
reported  with  the  inputs  on  Recommendation  1. 
The following are GAO's comments on the October 6, 1987, Department
of Defense letter.


GAO  Comments 


1. Our response to DOD's letter is presented in chapter 8. We have also
included in the final report additional information about our objectives
in chapter 1 to address DOD's concerns about the scope and focus of our
draft report.

2. We have pointed out in the report that the validation of simulations is
a difficult problem, and we have only suggested that more efforts in this
area might be taken. Achieving a "total level of validation" is not likely
ever to be possible, but we believe incremental improvements can be
made. We agree that the persons involved, a simulation's applications,
and the type of data input also should be considered in assessing
credibility, and we believe these are considered within our framework:
persons under factor 12, simulation application under factor 1, and input
data under factor 7.

3. We did not mean to imply that the expected-value approach is
intrinsically bad. However, we point out several limitations that resulted from
the use of this approach in the ADAGE Campaign submodel. DOD personnel
and experienced model practitioners also pointed out the concerns that
we raised. Our criticisms in this area were tempered by other statements
in the report pointing out strengths of the ADAGE model. For example, we
noted that the ADAGE's theoretical approach was appropriate for
addressing decisions concerning competing air defense weapons.
Glossary


Air-To-Air  Exchange  Ratio 

The proportion of enemy aircraft to friendly aircraft expected to be
destroyed  in  a  large  series  of  one-on-one  air  combat  encounters. 

Battery  Coolant  Unit 

A  component  used  to  cool  the  electronics  systems  of  the  Stinger  missile. 

Benchmark 

A  critical  comparison  of  the  results  of  one  simulation  model  with  those 
of  another. 

Configured  Encounter 

An interaction between hostile forces in which the geometry of the
situation is a component of the analysis of outcomes.

Data  Tailoring 

Making  significant  adjustments  to  raw  data  so  that  they  can  be  used  in  a 
particular  application. 

Deterministic  Model 

A  model  that  uses  expected  values  rather  than  distributions. 

Fixed-Wing  Aircraft 

All  aircraft  except  helicopters. 

Free  Encounter 

An interaction between hostile forces in which the geometry of the
situation is not considered in the analysis of outcomes.

Hardware-in-the-Loop 

A  form  of  simulation  that  incorporates  components  of  the  actual 
weapon  system. 

High  Resolution 

Compared to low resolution, the consideration of a large number of
factors in simulation.

Intervisibility 

Lines  of  visibility  where  terrain  must  be  considered  between  a  threat 
and  a  target. 

Jinking 

Making  relatively  small  but  abrupt  movements  in  three  dimensions,  as 
might  be  expected  from  a  helicopter  trying  to  avoid  enemy  fire. 

Low  Resolution 

See  High  resolution. 

Man-in-the-Loop 

A form of simulation that incorporates the human operator of the
weapon system.

Model 

A set of mathematical or logical relationships that describe how a
system works and behaves.

Monte  Carlo  Simulation 

Any simulation involving the use of random numbers.

Radar  Signature 

The  characteristics  of  electromagnetic  waves  reflected  from  a  target 
that  has  been  subjected  to  a  radar  beam. 

Replication 

The  repetition  of  a  simulation  using  different  random  numbers. 

Sensitivity  Testing 

Determining  if  a  model  behaves  as  expected  when  one  or  more  input 
variables  are  changed. 

Simulation 

A  computer  program  that  imitates  the  operations  of  various  kinds  of 
real-world  facilities  or  processes. 

Stochastic  Model 

A  model  that  uses  random  variables  defined  within  a  common  sample 
space. 

Structured  Walk-Through 

An  organized  procedure  for  reviewing  the  quality  and  accuracy  of  a 
computer  program. 

Validation

Determining that a model is an accurate representation of the real
system by comparing the model's output to that of the actual system or
substitutes for it.

Verification

Determining that the computer program prepared for a simulation
model is performing properly.
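
Illustrative sketch of selected glossary terms

Several of the terms defined above (deterministic model, stochastic model, Monte Carlo simulation, replication, and sensitivity testing) can be related to one another in a short Python sketch. It is not drawn from ADAGE, Carmonette, or COMO. The engagement it describes (a fixed number of shots, each with the same single-shot kill probability), the function names, and all parameter values are hypothetical assumptions chosen only for illustration.

```python
import random


def deterministic_kills(n_shots, p_kill):
    """Deterministic (expected-value) model: returns the expected number
    of kills instead of sampling from a distribution."""
    return n_shots * p_kill


def stochastic_kills(n_shots, p_kill, rng):
    """Stochastic model: each shot is decided by a random draw, so this is
    a (very small) Monte Carlo simulation of one engagement."""
    return sum(1 for _ in range(n_shots) if rng.random() < p_kill)


def replicate(n_reps, n_shots, p_kill, seed=0):
    """Replication: repeat the stochastic simulation with different random
    numbers and return the mean number of kills across replications."""
    rng = random.Random(seed)  # seeded so that a run can be reproduced
    results = [stochastic_kills(n_shots, p_kill, rng) for _ in range(n_reps)]
    return sum(results) / n_reps


def sensitivity(n_shots, p_values):
    """Sensitivity testing: vary one input (the hypothetical kill
    probability) and collect the outputs, so an analyst can check that
    the model responds in the expected direction."""
    return [deterministic_kills(n_shots, p) for p in p_values]


if __name__ == "__main__":
    print("expected-value result:", deterministic_kills(10, 0.5))
    print("mean of 2000 replications:", replicate(2000, 10, 0.5))
    print("sensitivity sweep:", sensitivity(10, [0.2, 0.4, 0.6]))
```

In this toy case the mean over many replications should lie close to the expected-value result. The stochastic version also yields a spread of outcomes, which an expected-value model does not provide.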